[one-liner]: Splitting & Joining Large Files

Recently at work I needed to FTP a multi-gigabyte file up to a vendor's site. They mentioned that their FTP server had experienced problems receiving files that large, so they recommended that I split the file into smaller chunks.

Splitting

Here is an example of how I used the UNIX command split to chop the file up. NOTE: For this example I’m going to use a file that’s ~347MB.

# file to be uploaded
% ls -l | grep exp_db
-rw-r--r--   1 oracle   dba      347181617 Jun 25 15:06 exp_db.dmp.gz
 
% split --help
Usage: split [-l #] [-a #] [file [name]]
       split [-b #[k|m]] [-a #] [file [name]]
       split [-#] [-a #] [file [name]]
 
# split file exp_db.dmp.gz
% split -b 100m exp_db.dmp.gz segment_
% ls -l | grep seg
-rw-r--r--   1 dbapps   staff    104857600 Jun 30 10:14 segment_aa
-rw-r--r--   1 dbapps   staff    104857600 Jun 30 10:14 segment_ab
-rw-r--r--   1 dbapps   staff    104857600 Jun 30 10:15 segment_ac
-rw-r--r--   1 dbapps   staff    32608817 Jun 30 10:15 segment_ad
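
As a quick sanity check on the listing above, the number of chunks and the size of the last one follow directly from the file size and the chunk size. Here is a minimal sketch using plain shell arithmetic; the variable names are my own and not part of split.

```shell
# Sizes taken from the ls output above.
file_size=347181617            # bytes in exp_db.dmp.gz
chunk_size=$((100 * 1048576))  # 100m = 104857600 bytes

# Ceiling division gives the number of chunks.
chunks=$(( (file_size + chunk_size - 1) / chunk_size ))

# The remainder is the size of the final, partial chunk
# (a remainder of 0 would mean the last chunk is a full one).
last=$(( file_size % chunk_size ))

echo "chunks: $chunks, last chunk: $last bytes"
# prints: chunks: 4, last chunk: 32608817 bytes
```

That agrees with the four segment_* files shown, the last of which is 32608817 bytes.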

Here’s the basic form of the split command:

split [ -b  n | nk | nm] [-a suffixlength] [ file [name]]
 
# -b switch, size of output files where:
# n  = n bytes
# nk = n*1024 bytes
# nm = n*1048576 bytes
 
# -a switch: the number of letters to use when creating each chunk's suffix
# (defaults to 2, e.g. aa, ab, ac, ... )
 
# file: the file you want to split
# name: the prefix portion of the name of a "chunk" of the split up file
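
To see the -b and -a switches working together, here is a small sketch on a throwaway file. The names demo.txt and chunk_ are hypothetical, and I'm assuming mktemp -d is available (it is on most systems, though it is not part of POSIX).

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 3893-byte sample file: the numbers 1..1000, one per line.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' > demo.txt

# 1k (1024-byte) chunks with three-letter suffixes instead of the default two.
split -b 1k -a 3 demo.txt chunk_

# Lists chunk_aaa, chunk_aab, chunk_aac and chunk_aad.
ls chunk_*
```

A longer suffix only matters when a file will be cut into more chunks than the default two letters can name (26*26 = 676), but it is handy to know about before splitting something very large into small pieces.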

Joining

And here is the command to join the segment_* files back into the original.

# example join command
% cat segment_* > exp_db.dmp.gz
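
The join works because the shell expands segment_* in sorted order, which is the same order split wrote the chunks in. If you want to convince yourself before trusting it with a real upload, here is a rough round-trip check on a throwaway file. All of the file names are hypothetical, and mktemp -d is assumed to be available.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a sample file with non-repeating content so ordering mistakes show up.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' > original.txt

# Split into 1k chunks (part_aa, part_ab, ...), then rejoin them.
split -b 1k original.txt part_
cat part_* > rejoined.txt

# cmp exits 0 only when the two files are byte-for-byte identical.
if cmp -s original.txt rejoined.txt; then
    result="match"
else
    result="MISMATCH"
fi
echo "$result"
```

For the real file, where the original and the rejoined copy live on different machines, comparing the output of cksum on each side should serve the same purpose.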

NOTE: For further details regarding my one-liner blog posts, check out my one-liner style guide primer.
