
Showing posts from 2012

Pool vs Raid group: Bad experience

With FLARE 30, pools are the new feature, and mixed pools are something totally new. You can combine multiple disk types in a common RAID configuration and put them inside the pool. Enable auto-tiering and you are promised good performance as data moves across the different tiers; the catch is that it is not real time (unlike VMAX). If your I/O is totally unpredictable, then by the time the array decides to move the data up or down a tier, the I/O profile has completely changed. In the next version of FLARE, you can have disks of different RAID types inside a single pool. What goes on inside the pool, and how the data is striped or written, is internal to EMC. The very idea is to give customers a comfortable way of handling storage, leaving the pain to EMC. However, things are not always as green as they look. This works well when you have predictable I/O and you know what is going on inside and when. What if the dynamics change every now and then and you do not have control over the I/O ...

NFS version 4 and 3

Version 3 maintains persistent locks: when a file system is unshared and shared back, the locks are not destroyed, so the state remains as if nothing had changed. Version 4 removes all state once the file system is unshared; any attempt to access the file again once it is shared back results in an I/O error. The state does not change if the file system was unshared only for a share-option change. A pseudo file system namespace is maintained in version 4. In version 3, clients must request an individual mount for each exported share, and path traversal across exports is not allowed; a version 4 client gets a seamless view of all the exported file systems. Version 4 supports volatile file handles in addition to persistent file handles. The server generates a unique file handle for every file; the file handle changes if the file is deleted and replaced by another file, but remains the same if the server reboots or the file is renamed. In oracle version 4 client con...
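The per-export versus pseudo-filesystem difference shows up directly in how a Linux client mounts the share. A minimal sketch, assuming a server named `server` with an export `/export/data` (both names are placeholders, not from the posts above):

```shell
# NFSv3: each exported share must be mounted individually.
mount -t nfs -o vers=3 server:/export/data /mnt/data

# NFSv4: mounting the server's pseudo-fs root ("/") gives the client
# a seamless view of every export below it.
mount -t nfs -o vers=4 server:/ /mnt/nfs4
```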

Rsync for bulk transfer

Suppose you have to transfer a huge number of files, say 50000, over the WAN as a one-time data copy, each file 3-4 MB. We have a couple of options: tar (needs extra space, unless you pipe it through ssh), scp (purely sequential), or rsync. What I have done here is use the --files-from option of rsync, which specifies which files to transfer. On the source, create the 50000-line file list with ls > /tmp/a, then split the file based on line count. The line count is the deciding factor: if you have a DS3 or faster link, make it smaller. The smaller the line count, the more chunks you get, which determines the concurrency, i.e. the number of simultaneous sessions you can start. split -l 1000 /tmp/a will create 50000/1000 = 50 files starting at xaa (this is the default prefix; you can pick your own) in the present working directory, assuming it is /tmp. Once done, use the below either as single commands or put them in a script and start your job. nohup /root/rsync/bin/rsync -avz -...
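The steps above can be sketched as a single script: build the list, split it, and launch one backgrounded rsync per chunk. The source directory, destination host, and log paths are illustrative assumptions, not taken from the original command (which is truncated above):

```shell
#!/bin/sh
# Parallel bulk transfer with rsync --files-from.
# /data/source, user@remote and /data/dest are placeholder names.
cd /data/source
ls > /tmp/a                          # the 50000-line file list
cd /tmp
split -l 1000 /tmp/a                 # chunks xaa, xab, ... (default prefix)
for chunk in x??; do
    # --files-from paths are read relative to the source argument,
    # so bare filenames from "ls" work as-is.
    nohup rsync -avz --files-from="/tmp/$chunk" /data/source \
        "user@remote:/data/dest" > "/tmp/$chunk.log" 2>&1 &
done
wait                                 # block until every session finishes
```

Each chunk becomes an independent rsync session, so a DS3-class link can be kept full instead of waiting on one sequential copy.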

NFS common problems

Have you come across a very silly problem while mounting an NFS share? It comes out of nowhere and refuses to go away. Hours of effort are wasted finding the exact issue, and it feels like banging our heads against the door. I have summarized some of them; I hope you have your own set of goodies as well. If you are getting an error like permission denied on the client (90% of issues fall here), then check in sequence what has to be done: 1. If the server has multiple IP addresses, there can be surprises. Do a traceroute from the server to the client and see which interface is being used to connect. Of course, please first check whether the client pings at all. Once you have the interface detail, this is the IP the clients should use to map the network share. 2. See if you can resolve the client IP address using DNS; if you can, use the DNS name in the share access list on the NFS server. 3. If the client has multiple interfaces, then do a traceroute to the IP found in 1...
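The checks above can be run as a short sequence; a sketch, with the server and client names as placeholder assumptions (and where traceroute is run depends on which direction you are checking, as described above):

```shell
# Placeholder names; substitute your own hosts and export path.
SERVER=nfs-server.example.com
CLIENT=client.example.com

ping -c 2 "$CLIENT"            # step 0: is the client reachable at all?
traceroute "$CLIENT"           # step 1 (on the server): which interface/IP answers?
nslookup "$CLIENT"             # step 2: can the server resolve the client in DNS?
showmount -e "$SERVER"         # on the client: confirm what is actually exported
mount -t nfs "$SERVER:/export/share" /mnt
```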

Friday night, almost there

It started on Friday evening; yes, Friday for everyone else, but not for me. Waiting for the EMC engineer to do a WebEx and configure our RecoverPoint appliance. Normally, if you have been planning a DR site (a big project indeed) for the last 6 months, how much time would you give for the lifeline to work, yes, data replication? One month would be a decent guess; we had to do it in 2 days. Don't laugh, I am very serious. We had done the rack and stack and all the cabling beforehand. It started around 9 PM and went on till 1 AM. All set for the testing. We just cross-checked whether every IP was reachable or not. Voila, the WAN IP was not pinging. It was on a different VLAN and was not in the firewall rules. Called up the firewall engineer, he didn't pick up the call; another, same result. Escalated to his manager, his phone was busy. Called up the project manager, he was furious (he had to be). Escalated to the Director of Information Security, and he confirmed he would do something. A never-ending wait started; finally, around 3...

Move LVM from multiple local and iscsi disk to SAN

We are setting up our DR site. For replication we are using the EMC RecoverPoint appliance. The license we have works only for SAN volumes on EMC arrays, namely VNX in our case. The databases and file shares that are already on SAN are no problem; however, the critical file shares on local storage or Dell iSCSI storage are an issue. It sounds easy to move them to SAN and start replication, but it is not. Consider the below: 1. The LVM housing the file shares on a Dell server contains a PV from a local disk partition (yes, not a complete disk, but something like /dev/sda10 on an extended partition) and other full disks from iSCSI storage, /dev/sdq and /dev/sdi. 2. The total size is ~400 GB, with 100 GB from the local disk partition (the total local disk size is 300 GB) and 300 GB (200 + 100) from iSCSI. I know it is wrong, but it was done a long time ago when we were running out of space. 3. The Dell server does not have an FC connection. 4. The file system structure contains a 32000 nested directory structure with approx 1.2 billion fi...
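For context, if the server could see the new SAN LUN directly, the textbook LVM migration would be an online pvmove; constraint 3 above (no FC connection on the Dell server) is exactly what rules this simple path out. A sketch under that assumption, with the new LUN appearing as /dev/sdz and the volume group named datavg (both illustrative):

```shell
# Textbook online migration of a VG onto a new SAN LUN.
# /dev/sdz and "datavg" are assumed names, not from the post.
pvcreate /dev/sdz                    # label the SAN LUN as a physical volume
vgextend datavg /dev/sdz             # add it to the existing volume group
pvmove /dev/sda10 /dev/sdz           # migrate extents off the local partition
pvmove /dev/sdq   /dev/sdz           # ...and off each iSCSI disk
pvmove /dev/sdi   /dev/sdz
vgreduce datavg /dev/sda10 /dev/sdq /dev/sdi   # retire the old PVs
```

pvmove works while the file system stays mounted, which is why it is the first thing to reach for when the hardware cooperates.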

Can you create a DR site in 6 months

Somebody might have done it, I don't know. Someone might think we are crazy and that it is impossible, but we are doing it. Please note it is not a one-to-one correspondence, but 100 percent DR with state-of-the-art technology at our DR site. I will just give you a glimpse of what we are doing and how. Our production is a mixed bag of: 1. Dell servers for Intel needs like Oracle databases (11g RAC), MySQL on Linux and MSSQL on Windows, around 80 non-virtualized servers in total. 2. Dell servers for other applications like Tomcat, JBoss, IIS, file servers etc., which are virtualized: 300 physical and 1200 virtual instances in total. 3. Roughly 50 percent are virtualized. 4. Dell servers which make up an 80-node Citrix farm. 5. Dialers and call recorders for 4000 users. 6. Exchange with 8000 users. 7. 1.4 PB of storage on EMC VNX and NS480, along with 100 TB on Dell MD3000i storage. 8. 80 in-house developed applications running 24x7. 9. Core databases running on Oracle M8000 and M5000's wit...