| |||
| Re: How to backup a huge amount of tiny files After takin' a swig o' grog, EOS belched out this bit o' wisdom: > rcrios******.com wrote: > >> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than >> 10 hours and we weren't even close to finish it. > > it has to be a *.tar? > if not mirdir perhaps > http://sourceforge.net/projects/mirdir How does it work compared to rsync? -- Tux rox! |
| |||
| Re: How to backup a huge amount of tiny files On Thu, 26 Jul 2007 22:15:26 +0200, Dawid Michalczyk <dm@eonworks.com> wrote: >rcrios******.com wrote: >> Hi, >> >> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM -- >> SAS 2.5" 10K rpm. >> >> We are trying to do a backup of a directory which has more or less >> 10.000.000 of xml files. The files size varies between 1K and 10K. >> >> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than >> 10 hours and we weren't even close to finish it. >> >> So, my question is: how to do a backup of a huge amount of tiny files? >> >It is never a good idea to have this many files in a single directory!! >I would split the files over 1000 and then use tar on it. This is interesting. What is the recommended maximum number of files per directory nowadays? Back in 1988 I worked for a short time with SCO Unix and remember reading somewhere that you should try to limit the number of files per directory to 8! Of course, processors and drives are a little faster and bigger today. :) |
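One way to read the "split the files over 1000" suggestion quoted above is to spread the files across 1000 subdirectories. A rough sketch of that reading, assuming GNU find and a made-up helper name (bucket_files); the hashing scheme is an illustration, not something anyone in the thread posted:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): move the files of one flat
# directory into numbered subdirectories so that no single directory
# holds millions of entries.
# Usage: bucket_files SRC_DIR DEST_DIR [BUCKETS]
bucket_files() {
    src=$1 dest=$2 buckets=${3:-1000}
    # find streams names as it reads the directory; a shell glob would
    # have to expand every name at once. Names containing newlines are
    # not handled by this simple read loop.
    find "$src" -maxdepth 1 -type f | while IFS= read -r path; do
        name=${path##*/}
        # cksum prints "CRC SIZE"; the CRC of the file name modulo the
        # bucket count gives a stable bucket number for that name.
        sum=$(printf '%s' "$name" | cksum | cut -d' ' -f1)
        bucket=$((sum % buckets))
        mkdir -p "$dest/$bucket"
        mv "$path" "$dest/$bucket/$name"
    done
}
```

Each bucket can then be archived on its own. Forking cksum once per file will be slow for ten million files, so treat this as a statement of the idea rather than something tuned for that scale.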
| |||
| Re: How to backup a huge amount of tiny files On Thu, 26 Jul 2007 06:44:00 -0700, rcrios******.com <rcrios******.com> wrote: > Hi, > > We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM -- > SAS 2.5" 10K rpm. > > We are trying to do a backup of a directory which has more or less > 10.000.000 of xml files. The files size varies between 1K and 10K. > > When we tried to do a tar cvfz backup.tgz /hugedir, we spent more > than 10 hours and we weren't even close to finish it. > > So, my question is: how to do a backup of a huge amount of tiny > files? I would use: tar cf backup.tar /hugedir gzip backup.tar I like to let tar do the compression too, but I wouldn't for that many files. Going this route tar should be reasonably quick, and you can do the compression in the background. JMTC, Michael C. -- mjchappell@verizon.net http://mcsuper5.freeshell.org/ They grumble the most who see the show on free passes. |
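The two-step approach in the post above (tar first, gzip afterwards) might be wrapped up roughly like this. The function name and the gzip_pid variable are made up for illustration, and the sketch assumes there is enough free disk space to hold the uncompressed archive:

```shell
#!/bin/sh
# Hypothetical wrapper around the two commands from the post above.
# Usage: archive_then_compress SRC_DIR ARCHIVE.tar
archive_then_compress() {
    src=$1 archive=$2
    # No 'v': printing millions of file names is itself slow.
    # No 'z': tar only reads the files and writes the archive.
    tar -cf "$archive" "$src" || return 1
    # Compress afterwards in the background at low priority.
    # gzip replaces ARCHIVE.tar with ARCHIVE.tar.gz when it finishes.
    nice -n 19 gzip "$archive" &
    gzip_pid=$!   # the caller can 'wait "$gzip_pid"' if it needs the result
}
```

For the case in this thread the call would be something like archive_then_compress /hugedir backup.tar, followed by wait "$gzip_pid" before copying the result anywhere.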
| |||
| Re: How to backup a huge amount of tiny files Linonut wrote: > How does it work compared to rsync? I've never used rsync :-( I should test rsync too, but I have no time... -- EOS www.photo-memories.be Running KDE 3.5.7 / openSUSE 10.2 |
| |||
| Re: How to backup a huge amount of tiny files On 2007-07-27 13:45, flupp wrote: > Not that I am too familiar with this kind of stuff, but wouldn't dd be > able to come to rescue in such cases ? > > Kind regards, > > flupp > No, what makes you think dd can prune a directory? /bb |
| |||
| Re: How to backup a huge amount of tiny files On 2007-07-28 23:34, Roy wrote: > On Thu, 26 Jul 2007 22:15:26 +0200, Dawid Michalczyk <dm@eonworks.com> > wrote: > >> rcrios******.com wrote: >>> Hi, >>> >>> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM -- >>> SAS 2.5" 10K rpm. >>> >>> We are trying to do a backup of a directory which has more or less >>> 10.000.000 of xml files. The files size varies between 1K and 10K. >>> >>> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than >>> 10 hours and we weren't even close to finish it. >>> >>> So, my question is: how to do a backup of a huge amount of tiny files? >>> >> It is never a good idea to have this many files in a single directory!! >> I would split the files over 1000 and then use tar on it. > > > This is interesting. What is the recommended maximum number of files per > directory nowadays? Back in 1988 I worked for a short time using SCO Unix > and remember reading somewhere that you should try to limit the number of > files per directiory to 8! Of course processors and drives are a little > faster and bigger today. :) > 8 was a common limit on the number of mounted filesystems at that time, not on the number of files; you must be remembering it wrong. I'm sure I had many thousands of files in the same directory in 1988, and the only problems I can remember were the limits of MS-DOS, not of my UNIX machines. /bb |
| |||
| Re: How to backup a huge amount of tiny files On 2007-07-29 06:49, Michael C. wrote: > On Thu, 26 Jul 2007 06:44:00 -0700, > rcrios******.com <rcrios******.com> wrote: >> Hi, >> >> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM -- >> SAS 2.5" 10K rpm. >> >> We are trying to do a backup of a directory which has more or less >> 10.000.000 of xml files. The files size varies between 1K and 10K. >> >> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more >> than 10 hours and we weren't even close to finish it. >> >> So, my question is: how to do a backup of a huge amount of tiny >> files? > > I would use: > > tar cf backup.tar /hugedir > gzip backup.tar > > I like to let tar do the compression too, but I wouldn't for that many > files. Going this route tar should be reasonably quick, and you can > do the compression in the background. > > JMTC, > > Michael C. A directory with 10 million XML files is more of a design problem: someone forgot to think before starting to write the software. Using tar forces every file to be read on every backup, which wastes even more resources. I know nothing about the application, but a database solution (MySQL, for example) may well work much better. /bb |
| |||
| Re: How to backup a huge amount of tiny files On Mon, 30 Jul 2007 17:57:39 +0200, birre <spamtrap@norsborg.net> wrote: >On 2007-07-28 23:34, Roy wrote: >> On Thu, 26 Jul 2007 22:15:26 +0200, Dawid Michalczyk <dm@eonworks.com> >> wrote: >> >>> rcrios******.com wrote: >>>> Hi, >>>> >>>> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM -- >>>> SAS 2.5" 10K rpm. >>>> >>>> We are trying to do a backup of a directory which has more or less >>>> 10.000.000 of xml files. The files size varies between 1K and 10K. >>>> >>>> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than >>>> 10 hours and we weren't even close to finish it. >>>> >>>> So, my question is: how to do a backup of a huge amount of tiny files? >>>> >>> It is never a good idea to have this many files in a single directory!! >>> I would split the files over 1000 and then use tar on it. >> >> >> This is interesting. What is the recommended maximum number of files per >> directory nowadays? Back in 1988 I worked for a short time using SCO Unix >> and remember reading somewhere that you should try to limit the number of >> files per directiory to 8! Of course processors and drives are a little >> faster and bigger today. :) >> > >8 was a common limits of mounted filesystems at that time, not the number of >files, you must remember wrong. Could be. My memory's real good but it's short. > >I'm sure I had many thousands of files in the same dir 1988, and the only >problem I can remember was the limits of MS-DOS, not on my UNIX machines. > >/bb |
| |||
| Re: How to backup a huge amount of tiny files rcrios******.com wrote in news:1185457440.739319.113700 @d30g2000prg.googlegroups.com: > Hi, > > We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM -- > SAS 2.5" 10K rpm. > > We are trying to do a backup of a directory which has more or less > 10.000.000 of xml files. The files size varies between 1K and 10K. > > When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than > 10 hours and we weren't even close to finish it. > > So, my question is: how to do a backup of a huge amount of tiny files? > > TIA, > > Bob > Well, wow, that's a stupid ****ing design! Who thought that was a good idea? Anyway, the best solution in this case is running a combination of RAID1 and a log-structured file system (LFS). A few projects have been working on an LFS for Linux; look around. The LFS will give you backup and the RAID will give you redundancy, so you get the combination every good archive system should have. Next, set up a directory monitor on that directory and watch for file access. I did this once but forgot which program I used; I am sure someone will recommend a good one. Dump the names of all changed files into a text file. That way, when the time for archiving comes, you can simply copy the files that have actually changed rather than all of them, and you do not need to do a costly search for changed files. Or, if you have the space for it, just "dd" an image of the hard drive on which the directory is located and then compress the image. That's probably the simplest and fastest solution, but it only really works if you can move all the files onto a separate drive. If you run RAID1, you can also split off one disk, run the exceedingly cumbersome backup process on that disk, and then have the RAID rebuild it. Hope one of those ideas helps. - Bogdan |
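The "only archive what changed" idea above can be approximated without a directory monitor by comparing modification times against a timestamp file. This is a different and simpler technique than the monitor Bogdan describes: it still walks the whole directory on every run, so it does not avoid the costly scan, and it does not notice deleted files. A sketch assuming GNU find and GNU tar, with a made-up function name:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): archive only the files
# modified since the previous run, using a timestamp file as the marker.
# Usage: backup_changed SRC_DIR STAMP_FILE ARCHIVE.tar
backup_changed() {
    src=$1 stamp=$2 archive=$3
    new_stamp="$stamp.new"
    # Take the new timestamp before scanning, so files changed while the
    # scan is running are picked up again next time rather than missed.
    touch "$new_stamp"
    if [ -f "$stamp" ]; then
        find "$src" -type f -newer "$stamp" -print0
    else
        find "$src" -type f -print0       # first run: take everything
    fi | tar --null -T - -cf "$archive" || return 1
    # Only advance the marker once the archive was written successfully.
    mv "$new_stamp" "$stamp"
}
```

The first call produces a full archive and later calls produce small incremental ones, so a restore means unpacking the full archive and then each incremental one in order.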
| |||
| Re: How to backup a huge amount of tiny files Hi, Thanks for all the replies. Yes, I agree that the design is really poor, and it's even sadder that it's a turnkey system. AND I'm not talking about a small company... I asked them how to do the backup, and they weren't able to answer... So what we'll try to do is back it up using the script that someone posted earlier. Well, thank you very much. |
| |||
| Re: How to backup a huge amount of tiny files On 2007-08-01 16:12, rcrios******.com wrote: > Hi, > > Thanks for all the replies. Yes, I agree that the design is realy > poor, and it's even more sad that it's a turn key system. AND I'm not > talking about a small company... > > I asked then how to do the backup, and they aren't able to answer > it... > > So, what we'll try to do it, is backup it using the script that > someone posted earlier. > > Well, thank you very much. > > > Are you talking about a company that is NOT small and has no sysadmins who can design a computer system? Backup/restore comes first on the list, and in the reverse order (restore, then backup). How critical is downtime for the system? How much time do we have to recover? Can any data transfers or updates be lost? What do we need to make that possible? How do we make backups so that it is possible? How do we design the filesystems and store the data? Then design the system, install it, patch it, test it. Make a backup, and when everything works, take it into production. Only foolish Microsoft admins install systems and leave the backup/restore problem until last, when temp files, data, logs, configs and applications are mixed up everywhere in a mess on gigantic filesystems. Admins who have built systems for years may not even realize that they plan for restore/backup first, but they do it by intuition. If you think the backup takes a long time, just wait for the day the machine has crashed and they want it fixed _NOW_. Will you be there then, waiting for 10 million files to be read from a tar archive after the painful restore of machine, OS and applications, when everyone is already past their limits and running around in panic? (I bet that this machine can't even be recovered from bare metal.) Use a database for it; you can dump and restore that much faster, and with proper indexes I guess even access and updates will be faster. 
Sorry if I'm negative, but I have seen this problem many times before. It always starts with a simple question, but the question itself says "we just figured out a problem". And as I wrote before: once you fill up a directory with that many files, the directory itself will not get smaller even if you delete all the files. So if you delete all but 10 of them, all lookups in the directory will still be terribly slow until you do rmdir/mkdir and start from scratch. That system should never have been accepted as delivered without proper backup/restore procedures, and with so many files in one directory. It was made by beginners or MS drones who believe a system should be so easy to admin that you don't need much knowledge. Rebuild the system from scratch before it costs a fortune :-) /bb |
| |||
| Re: How to backup a huge amount of tiny files On 2007-08-01, rcrios******.com <rcrios******.com> wrote: > Thanks for all the replies. Yes, I agree that the design is realy > poor, and it's even more sad that it's a turn key system. AND I'm not > talking about a small company... > > I asked then how to do the backup, and they aren't able to answer > it... > > So, what we'll try to do it, is backup it using the script that > someone posted earlier. Let us know which one, I'm very curious. -- There is an art, it says, or rather, a knack to flying. The knack lies in learning how to throw yourself at the ground and miss. Douglas Adams |
| |||
| Re: How to backup a huge amount of tiny files rcrios******.com wrote: > Hi, > > We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM -- > SAS 2.5" 10K rpm. > > We are trying to do a backup of a directory which has more or less > 10.000.000 of xml files. The files size varies between 1K and 10K. > > When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than > 10 hours and we weren't even close to finish it. > > So, my question is: how to do a backup of a huge amount of tiny files? > > TIA, > > Bob > I am surprised that nobody suggested disk imaging. Suppose you have a file system dedicated to the xml files mounted on /hugedir from /dev/vg00/lvol1. dd if=/dev/vg00/lvol1 bs=1024k | gzip >backup.dmg.gz You could get a corrupt backup if something writes to the filesystem while you back it up. You could synch, remount read-only, backup, and remount read-write. If I remember correctly, RHEL does not sport LVM snapshots. I've seen primitive filesystem snapshots implemented with RAID1. Shut down pertinent daemons, synch the fs, split the mirror, start up the daemons, and back up the non-live side of the mirror. Then join it back to the RAID1 set and let it rebuild. Ben |
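The imaging steps described above (sync, remount read-only, image, remount read-write) could be scripted roughly as follows. The function name is made up; the dd and gzip invocation is the one from the post. It needs root when pointed at a real device, and the read-only remount will fail while any process still has files on the filesystem open for writing:

```shell
#!/bin/sh
# Hypothetical wrapper (not from the thread) around the dd | gzip command
# from the post above. If MOUNTPOINT is given, the filesystem is remounted
# read-only while the image is taken, then remounted read-write.
# Usage: image_backup DEVICE OUTPUT.gz [MOUNTPOINT]
image_backup() {
    dev=$1 out=$2 mnt=${3:-}
    if [ -n "$mnt" ]; then
        sync
        mount -o remount,ro "$mnt" || return 1
    fi
    # Read the raw device in 1 MiB blocks and compress on the fly.
    # Note: $? after a pipeline is gzip's exit status, not dd's.
    dd if="$dev" bs=1024k 2>/dev/null | gzip > "$out"
    status=$?
    if [ -n "$mnt" ]; then
        mount -o remount,rw "$mnt" || return 1
    fi
    return $status
}
```

With the names from the post, the call would be image_backup /dev/vg00/lvol1 backup.dmg.gz /hugedir. The image covers the whole volume, free space included, so it is only attractive when the volume is dedicated to these files.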
| |||
| Re: How to backup a huge amount of tiny files Government satellites recorded Ben Collver saying: > I am surprised that nobody suggested disk imaging. Suppose you have a file > system dedicated to the xml files mounted on /hugedir from /dev/vg00/lvol1. > <snipped and word wrapped> Ben, It's **** hard to read your post when it scrolls off the right hand side of the screen continually. Please set your word wrap to, at most, 80 characters; would be very helpful. Thanks, -- sk8r-365 http://goodbye-microsoft.com/ |
| |||
| Re: How to backup a huge amount of tiny files sk8r-365 wrote: > Ben, It's **** hard to read your post when it scrolls off the right > hand side of the screen continually. Please set your word wrap to, at > most, 80 characters; would be very helpful. Sorry about that, I didn't realize I was sending long lines. I've reconfigured Thunderbird to wrap the lines at 72 columns. By default Thunderbird sends format=flowed. SLRN does not support this, but you can read it more easily if you set your word wrap in SLRN. http://www.faqs.org/rfcs/rfc3676.html http://www.slrn.org/manual/slrn-manual-6.html#ss6.142 Here is a funny rant about hard line breaks. http://xahlee.org/UnixResource_dir/w...cate_line.html Ben |