  #16 (permalink)  
Old 07-28-2007, 10:10 AM
Linonut
Re: How to backup a huge amount of tiny files

After takin' a swig o' grog, EOS belched out this bit o' wisdom:

> rcrios******.com wrote:
>
>> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than
>> 10 hours and we weren't even close to finish it.

>
> it has to be a *.tar?
> if not mirdir perhaps
> http://sourceforge.net/projects/mirdir


How does it work compared to rsync?

--
Tux rox!
  #17 (permalink)  
Old 07-28-2007, 01:50 PM
Roy
Re: How to backup a huge amount of tiny files

On Thu, 26 Jul 2007 22:15:26 +0200, Dawid Michalczyk <dm@eonworks.com>
wrote:

>rcrios******.com wrote:
>> Hi,
>>
>> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM --
>> SAS 2.5" 10K rpm.
>>
>> We are trying to do a backup of a directory which has more or less
>> 10.000.000 of xml files. The files size varies between 1K and 10K.
>>
>> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than
>> 10 hours and we weren't even close to finish it.
>>
>> So, my question is: how to do a backup of a huge amount of tiny files?
>>

>It is never a good idea to have this many files in a single directory!!
>I would split the files over 1000 and then use tar on it.



This is interesting. What is the recommended maximum number of files per
directory nowadays? Back in 1988 I worked for a short time using SCO Unix
and remember reading somewhere that you should try to limit the number of
files per directory to 8! Of course processors and drives are a little
faster and bigger today. :)
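
A rough sketch of the splitting idea quoted above, for anyone who wants to try it. It assumes GNU find and file names without newlines, and the function name and the "partNNNN" layout are made up for illustration. It simply moves the files of one flat directory into numbered subdirectories holding a fixed number of files each, so that tar can then be run per subdirectory. Whether the application tolerates the new layout is a separate question.

```shell
#!/bin/sh
# Hypothetical helper: move the regular files directly inside SRC into
# numbered subdirectories SRC/part0000, SRC/part0001, ... with at most
# PER_DIR files each. Names and layout are assumptions, not from the thread.
split_into_buckets() {
    src=$1
    per_dir=${2:-10000}
    count=0
    # -maxdepth 1 keeps find out of the subdirectories created below.
    find "$src" -maxdepth 1 -type f | while IFS= read -r f; do
        dir=$(printf '%s/part%04d' "$src" $((count / per_dir)))
        [ -d "$dir" ] || mkdir "$dir"
        mv -- "$f" "$dir/"
        count=$((count + 1))
    done
}

# Possible use afterwards (not run here):
#   split_into_buckets /hugedir 10000
#   for d in /hugedir/part*; do tar cf "$(basename "$d").tar" "$d"; done
```

This runs one mv per file, so on ten million files it will itself take a while. It is meant to show the layout, not to be fast.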


  #18 (permalink)  
Old 07-28-2007, 09:00 PM
Michael C.
Re: How to backup a huge amount of tiny files

On Thu, 26 Jul 2007 06:44:00 -0700,
rcrios******.com <rcrios******.com> wrote:
> Hi,
>
> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM --
> SAS 2.5" 10K rpm.
>
> We are trying to do a backup of a directory which has more or less
> 10.000.000 of xml files. The files size varies between 1K and 10K.
>
> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more
> than 10 hours and we weren't even close to finish it.
>
> So, my question is: how to do a backup of a huge amount of tiny
> files?


I would use:

tar cf backup.tar /hugedir
gzip backup.tar

I like to let tar do the compression too, but I wouldn't for that many
files. Going this route tar should be reasonably quick, and you can
do the compression in the background.
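
The two commands above could be wrapped up roughly like this. It is only a sketch: the function name is made up, and running gzip under nice is an added assumption so the compression does not compete with whatever else the machine is doing. The v flag from the original command is left out on purpose, since listing millions of file names is extra work.

```shell
#!/bin/sh
# Hypothetical wrapper: write an uncompressed archive first, then
# compress it at low priority in the background.
backup_then_compress() {
    src=$1       # directory to archive, e.g. /hugedir
    archive=$2   # e.g. backup.tar; gzip turns it into backup.tar.gz
    tar cf "$archive" "$src" || return 1
    # gzip replaces the .tar with a .tar.gz when it finishes.
    nice -n 19 gzip "$archive" &
}

# Possible use (not run here):
#   backup_then_compress /hugedir backup.tar
#   wait    # only needed if the script must block until gzip is done
```

Note that the uncompressed backup.tar needs room on disk until gzip has finished.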

JMTC,

Michael C.
--
mjchappell@verizon.net http://mcsuper5.freeshell.org/

They grumble the most who see the show on free passes.

  #19 (permalink)  
Old 07-28-2007, 10:30 PM
EOS
Re: How to backup a huge amount of tiny files

Linonut wrote:

> How does it work compared to rsync?


Never used rsync :-(
I should test rsync too, but no time...
--
EOS
www.photo-memories.be
Running KDE 3.5.7 / openSUSE 10.2

  #20 (permalink)  
Old 07-30-2007, 07:50 AM
birre
Re: How to backup a huge amount of tiny files

On 2007-07-27 13:45, flupp wrote:
> Not that I am too familiar with this kind of stuff, but wouldn't dd be
> able to come to rescue in such cases ?
>
> Kind regards,
>
> flupp
>


No, what makes you think dd can prune a directory?

/bb

  #21 (permalink)  
Old 07-30-2007, 08:10 AM
birre
Re: How to backup a huge amount of tiny files

On 2007-07-28 23:34, Roy wrote:
> On Thu, 26 Jul 2007 22:15:26 +0200, Dawid Michalczyk <dm@eonworks.com>
> wrote:
>
>> rcrios******.com wrote:
>>> Hi,
>>>
>>> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM --
>>> SAS 2.5" 10K rpm.
>>>
>>> We are trying to do a backup of a directory which has more or less
>>> 10.000.000 of xml files. The files size varies between 1K and 10K.
>>>
>>> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than
>>> 10 hours and we weren't even close to finish it.
>>>
>>> So, my question is: how to do a backup of a huge amount of tiny files?
>>>

>> It is never a good idea to have this many files in a single directory!!
>> I would split the files over 1000 and then use tar on it.

>
>
> This is interesting. What is the recommended maximum number of files per
> directory nowadays? Back in 1988 I worked for a short time using SCO Unix
> and remember reading somewhere that you should try to limit the number of
> files per directiory to 8! Of course processors and drives are a little
> faster and bigger today. :)
>


8 was a common limit on the number of mounted filesystems at that time,
not on the number of files; you must be remembering wrong.

I'm sure I had many thousands of files in the same directory in 1988, and
the only problem I can remember was the limits of MS-DOS, not my UNIX machines.

/bb

  #22 (permalink)  
Old 07-30-2007, 08:20 AM
birre
Re: How to backup a huge amount of tiny files

On 2007-07-29 06:49, Michael C. wrote:
> On Thu, 26 Jul 2007 06:44:00 -0700,
> rcrios******.com <rcrios******.com> wrote:
>> Hi,
>>
>> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM --
>> SAS 2.5" 10K rpm.
>>
>> We are trying to do a backup of a directory which has more or less
>> 10.000.000 of xml files. The files size varies between 1K and 10K.
>>
>> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more
>> than 10 hours and we weren't even close to finish it.
>>
>> So, my question is: how to do a backup of a huge amount of tiny
>> files?

>
> I would use:
>
> tar cf backup.tar /hugedir
> gzip backup.tar
>
> I like to let tar do the compression too, but I wouldn't for that many
> files. Going this route tar should be reasonably quick, and you can
> do the compression in the background.
>
> JMTC,
>
> Michael C.


A directory with 10 million XML files is more of a design problem, where
someone forgot to think before starting to write the software.

Using tar will force every file to be read every time, and will waste
even more resources.

I know nothing about the application, but it may well be that a database
solution (MySQL, for example) would work much better.

/bb

  #23 (permalink)  
Old 07-31-2007, 12:40 AM
Roy
Re: How to backup a huge amount of tiny files

On Mon, 30 Jul 2007 17:57:39 +0200, birre <spamtrap@norsborg.net> wrote:

>On 2007-07-28 23:34, Roy wrote:
>> On Thu, 26 Jul 2007 22:15:26 +0200, Dawid Michalczyk <dm@eonworks.com>
>> wrote:
>>
>>> rcrios******.com wrote:
>>>> Hi,
>>>>
>>>> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM --
>>>> SAS 2.5" 10K rpm.
>>>>
>>>> We are trying to do a backup of a directory which has more or less
>>>> 10.000.000 of xml files. The files size varies between 1K and 10K.
>>>>
>>>> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than
>>>> 10 hours and we weren't even close to finish it.
>>>>
>>>> So, my question is: how to do a backup of a huge amount of tiny files?
>>>>
>>> It is never a good idea to have this many files in a single directory!!
>>> I would split the files over 1000 and then use tar on it.

>>
>>
>> This is interesting. What is the recommended maximum number of files per
>> directory nowadays? Back in 1988 I worked for a short time using SCO Unix
>> and remember reading somewhere that you should try to limit the number of
>> files per directiory to 8! Of course processors and drives are a little
>> faster and bigger today. :)
>>

>
>8 was a common limits of mounted filesystems at that time, not the number of
>files, you must remember wrong.


Could be. My memory's real good but it's short.
>
>I'm sure I had many thousands of files in the same dir 1988, and the only
>problem I can remember was the limits of MS-DOS, not on my UNIX machines.
>
>/bb


  #24 (permalink)  
Old 07-31-2007, 04:20 PM
x0054
Re: How to backup a huge amount of tiny files

rcrios******.com wrote in news:1185457440.739319.113700
@d30g2000prg.googlegroups.com:

> Hi,
>
> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM --
> SAS 2.5" 10K rpm.
>
> We are trying to do a backup of a directory which has more or less
> 10.000.000 of xml files. The files size varies between 1K and 10K.
>
> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than
> 10 hours and we weren't even close to finish it.
>
> So, my question is: how to do a backup of a huge amount of tiny files?
>
> TIA,
>
> Bob
>


Well, wow, that's a stupid ****ing design! Who thought that was a good
idea? Anyway, the best solution in this case is running a combination
of RAID1 and a Log File System (LFS). A few projects have been working
on an LFS for Linux; look around. LFS will give you backup, and
RAID will give you redundancy, so you get the combination every good
archive system should have.

Next, set up a directory monitor on that directory and watch for file
access. I did this once but forgot which program I used. I am sure
someone will recommend a good one. Dump the names of all changed files into
a text file. That way, once the time for archiving comes, you can simply
copy the files that have actually changed, rather than all of them, and
at the same time you do not need to do a costly file change search.
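
A minimal sketch of the second half of that idea, assuming the monitor (whichever one is used, it is not shown here) has already written one changed path per line into a text file. The function name is made up, and the -T (files-from) option is the one found in GNU tar. The list is sorted and de-duplicated first, because a file that changed several times would otherwise be stored several times.

```shell
#!/bin/sh
# Hypothetical helper: archive only the paths named in LIST.
# LIST is assumed to hold one path per line, no newlines in names.
archive_changed() {
    list=$1   # text file written by the directory monitor
    out=$2    # compressed archive to create
    uniq_list=$(mktemp) || return 1
    # A file that changed twice should only be archived once.
    sort -u "$list" > "$uniq_list"
    tar -c -z -f "$out" -T "$uniq_list"
    status=$?
    rm -f "$uniq_list"
    return "$status"
}

# Possible use (not run here):
#   archive_changed /var/tmp/changed.txt /backup/changed-$(date +%F).tgz
```

If a listed file has been deleted in the meantime, tar complains and returns a non-zero status, so the caller should decide whether that counts as a failure.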

Or, if you have the space for it, just "dd" an image of the hard drive on
which the directory is located and then compress the image. That's
probably the simplest and fastest solution, but it only really works
if you can move all the files onto a separate drive. If you run RAID1,
then you can also split off one disk, do the exceedingly cumbersome
backup process on that disk and then have it rebuilt by the RAID.

Hope one of those ideas helps.

- Bogdan

  #25 (permalink)  
Old 08-01-2007, 06:20 AM
rcrios@gmail.com
Re: How to backup a huge amount of tiny files

Hi,

Thanks for all the replies. Yes, I agree that the design is really
poor, and it's even sadder that it's a turnkey system. AND I'm not
talking about a small company...

I asked them how to do the backup, and they weren't able to answer
it...

So what we'll try to do is back it up using the script that
someone posted earlier.

Well, thank you very much.


  #26 (permalink)  
Old 08-01-2007, 07:50 AM
birre
Re: How to backup a huge amount of tiny files

On 2007-08-01 16:12, rcrios******.com wrote:
> Hi,
>
> Thanks for all the replies. Yes, I agree that the design is realy
> poor, and it's even more sad that it's a turn key system. AND I'm not
> talking about a small company...
>
> I asked then how to do the backup, and they aren't able to answer
> it...
>
> So, what we'll try to do it, is backup it using the script that
> someone posted earlier.
>
> Well, thank you very much.
>
>
>


Are you talking about a company that is NOT small, and has no
sysadmins who can design a computer system?

Backup/restore is the first item on the list, and in the reverse
order: restore first, then backup.

How critical is downtime for the system?
How much time do we have to recover?
Can any data transfers/updates be lost?
What do we need to make it possible?
How do we make backups so that it is possible?
How do we design the filesystems and store the data?

Then design the system, install it, patch it, test it.
Make a backup, and when everything works, put it into production.

Only foolish Microsoft admins install systems and leave the
backup/restore problem until last, when temp files, data, logs,
configs, and applications are all mixed up in a mess on gigantic
filesystems.

Admins who have built systems for years may not even realize that they
plan for restore/backup first, but they do it by intuition.

If you think the backup takes a long time, just wait for the day
the machine has crashed and they want it fixed _NOW_.

Will you be there then, waiting for 10 million files to be read
from a tar archive after the painful restore of machine, OS and applications,
when everyone is already past their limits and running around in a panic?
(I bet that this machine can't even be recovered from bare metal.)

Use a database for it, you can dump/restore that much faster,
and with proper indexes, I guess even access/update will be faster.

Sorry if I'm negative, but I have seen these problems many times before.
It always starts with a simple question, but the question itself says
"we just figured out a problem".

And as I wrote before:
once you fill up a directory with so many files, the directory itself
will not get smaller even if you delete all the files.
So if you delete all but 10 of them, all lookups in the directory will still
be terribly slow until you do rmdir/mkdir and start from scratch.

That system should never have been accepted as delivered, without proper
backup/restore procedures and with so many files in a directory.
It was made by beginners or MS drones who believe a system should be so
easy to admin that you don't need much knowledge.

Rebuild the system from scratch before it cost a fortune :-)

/bb

  #27 (permalink)  
Old 08-01-2007, 12:00 PM
Rikishi 42
Re: How to backup a huge amount of tiny files

On 2007-08-01, rcrios******.com <rcrios******.com> wrote:

> Thanks for all the replies. Yes, I agree that the design is realy
> poor, and it's even more sad that it's a turn key system. AND I'm not
> talking about a small company...
>
> I asked then how to do the backup, and they aren't able to answer
> it...
>
> So, what we'll try to do it, is backup it using the script that
> someone posted earlier.


Let us know which one, I'm very curious.


--
There is an art, it says, or rather, a knack to flying.
The knack lies in learning how to throw yourself at the ground and miss.
Douglas Adams

  #28 (permalink)  
Old 08-04-2007, 05:20 AM
Ben Collver
Re: How to backup a huge amount of tiny files

rcrios******.com wrote:
> Hi,
>
> We have a RHEL 4 running on a dual Xeon (5110) 1.6Ghz -- 8Gb RAM --
> SAS 2.5" 10K rpm.
>
> We are trying to do a backup of a directory which has more or less
> 10.000.000 of xml files. The files size varies between 1K and 10K.
>
> When we tried to do a tar cvfz backup.tgz /hugedir, we spent more than
> 10 hours and we weren't even close to finish it.
>
> So, my question is: how to do a backup of a huge amount of tiny files?
>
> TIA,
>
> Bob
>


I am surprised that nobody suggested disk imaging. Suppose you have a file system dedicated to the xml files mounted on /hugedir from /dev/vg00/lvol1.

dd if=/dev/vg00/lvol1 bs=1024k | gzip >backup.dmg.gz

You could get a corrupt backup if something writes to the filesystem while you back it up. You could synch, remount read-only, backup, and remount read-write.
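
A sketch of how the imaging and a sanity check might be scripted, under a few assumptions: the function names are made up, the source is whatever block device holds the filesystem (the /dev/vg00/lvol1 above is itself only an example), and the script runs as root. The remount steps appear only as comments. The check re-reads the source and compares it with the decompressed image, so it only makes sense while the filesystem is still read-only.

```shell
#!/bin/sh
# Hypothetical helpers for a compressed raw image of a volume.
image_device() {
    src=$1   # block device, e.g. /dev/vg00/lvol1
    out=$2   # compressed image, e.g. backup.dmg.gz
    # Note: the exit status is gzip's; a dd read error is not detected here.
    dd if="$src" bs=1024k 2>/dev/null | gzip > "$out"
}

# Succeeds only if the image still matches the source byte for byte.
verify_image() {
    src=$1
    image=$2
    gzip -dc "$image" | cmp -s - "$src"
}

# Possible sequence around it (not run here, needs root):
#   sync
#   mount -o remount,ro /hugedir
#   image_device /dev/vg00/lvol1 backup.dmg.gz
#   verify_image /dev/vg00/lvol1 backup.dmg.gz || echo "image differs" >&2
#   mount -o remount,rw /hugedir
```

The verify step doubles the read time, so on a large volume it may be worth skipping once the procedure is trusted.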

If I remember correctly, RHEL does not sport LVM snapshots. I've seen primitive filesystem snapshots implemented with RAID1. Shut down pertinent daemons, synch the fs, split the mirror, start up the daemons, and back up the non-live side of the mirror. Then join it back to the RAID1 set and let it rebuild.

Ben

  #29 (permalink)  
Old 08-04-2007, 07:41 AM
sk8r-365
Re: How to backup a huge amount of tiny files

Government satellites recorded Ben Collver saying:

> I am surprised that nobody suggested disk imaging. Suppose you have a file
> system dedicated to the xml files mounted on /hugedir from /dev/vg00/lvol1.
>

<snipped and word wrapped>

Ben, It's **** hard to read your post when it scrolls off the right
hand side of the screen continually. Please set your word wrap to, at
most, 80 characters; would be very helpful.

Thanks,
--
sk8r-365

http://goodbye-microsoft.com/

  #30 (permalink)  
Old 08-05-2007, 07:40 AM
Ben Collver
Re: How to backup a huge amount of tiny files

sk8r-365 wrote:
> Ben, It's **** hard to read your post when it scrolls off the right
> hand side of the screen continually. Please set your word wrap to, at
> most, 80 characters; would be very helpful.


Sorry about that, I didn't realize I was sending long lines. I've
reconfigured Thunderbird to wrap the lines at 72 columns.

By default Thunderbird sends format=flowed. SLRN does not support this,
but you can read it more easily if you set your word wrap in SLRN.
http://www.faqs.org/rfcs/rfc3676.html
http://www.slrn.org/manual/slrn-manual-6.html#ss6.142

Here is a funny rant about hard line breaks.
http://xahlee.org/UnixResource_dir/w...cate_line.html

Ben
All times are GMT -8.