Monthly Archives: October 2011

The Power of Social Media in Troubleshooting Issues

We recently finished migrating 45 TB of data from our aging XIO Storage (Xiotech) and EMC Fibre Channel SANs (and an Apple DAS) to a new five-node Isilon NL-Series cluster that we implemented in early September.

We migrated away from dedicated servers – a mix of Windows Server 2003, Windows Server 2008, Windows Server 2008 R2, and OS X Server (Snow Leopard) – to the NAS functionality of the Isilon, using a mix of SMB and NFS shares.

A few weeks in, we started to hear grumblings that things were slower. How could that possibly be? Mac OS X users specifically were saying that transfers of images for Photoshop and proofing were taking 10-20x longer than they had before the migration. We tested on the Windows-based machines that the admin team uses daily and didn’t see the issue. The cluster was hardly working, so it wasn’t a problem with the cluster’s performance.

Troubleshooting showed that the issue was specific to OS X. Every 10.x version saw the same issue, so it wasn’t tied to a particular release. Switching to NFS would get the performance back, but we have yet to find the information we need to implement user authentication over NFS, and locking shares to dedicated IPs won’t work for us.

After a bit of Google searching, I decided I’d reach out to Twitter. I use Twitter mostly for technical education, business networking, and a few jokes here and there, so my friends list is primarily technical.

So I sent out a little tweet:
image

@vTexan, an EMC vSpecialist who follows me, then retweeted the request for help and CC’d the Isilon cavalry:
image

@peglarr, the CTO-Americas of Isilon, whom vTexan had CC’d, chimed in that night:
image

He also direct messaged me a link to the article he referenced for fixing the issue, on slow Samba file copying speeds in Mac OS X.

Essentially, if your Mac OS X clients are seeing slow file copies on a Samba (SMB) share on an Isilon cluster:

Open a terminal session and run:
sudo vi /etc/sysctl.conf

Then add the following line to /etc/sysctl.conf:
net.inet.tcp.delayed_ack=0
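
For anyone rolling this out to more than a couple of Macs, the manual edit could be scripted. What follows is only a minimal sketch, not the method from the referenced article: the function name ensure_sysctl_line is made up for illustration, and it simply appends the setting to the given file if an identical line is not already there. The commented sysctl -w line is the standard way to change a sysctl value at runtime. Whether that is needed on top of the file edit is an assumption on my part, so test it on one machine first.

```shell
#!/bin/sh
# Hypothetical helper: add a "key=value" line to a sysctl config file
# only when that exact line is not already present.
# Usage: ensure_sysctl_line <file> <key=value>
ensure_sysctl_line() {
    conf_file="$1"
    setting="$2"

    # Create the file if it does not exist yet.
    [ -f "$conf_file" ] || : > "$conf_file"

    # -q quiet, -x whole-line match, -F fixed string (no regex).
    if grep -qxF "$setting" "$conf_file"; then
        echo "already set: $setting"
        return 0
    fi

    # If the file is non-empty and does not end in a newline, add one
    # so the new setting lands on its own line.
    if [ -s "$conf_file" ] && [ -n "$(tail -c 1 "$conf_file")" ]; then
        echo >> "$conf_file"
    fi

    printf '%s\n' "$setting" >> "$conf_file"
    echo "added: $setting"
}

# On a real Mac this needs root, for example from a root shell:
#   ensure_sysctl_line /etc/sysctl.conf net.inet.tcp.delayed_ack=0
# Assumption: to change the running value without a reboot:
#   sudo sysctl -w net.inet.tcp.delayed_ack=0
# To check the current value:
#   sysctl net.inet.tcp.delayed_ack
```

Running the function a second time against the same file leaves it unchanged, so it should be safe to push from a login script or a management tool.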

The difference was dramatic. With a copy of an existing 4 GB set of files already running on a machine, we could make the above change and start the same copy a second time, and the second copy would finish before the first was even close to being done.

Problem solved! Thanks to social media, and specifically the power of Twitter, we were able to solve a problem that was affecting 30+ users after a migration that was supposed to make their lives easier. I may not have chosen the right keywords in my Google search, but thankfully people out there in the Twitterverse already had.