This removes extraneous double-quotes from comma-delimited text files where double-quotes are supposed to be the text delimiter.
Say you have a record like this, where someone used a double-quote within
a string:
"FIELD1","Field 2 "is cool"","FIELD3"
Using this will change this to:
"FIELD1","Field 2 'is cool'","FIELD3"
First, slap the moron who used double-quotes within a data field.
Download THIS ZIP FILE
Unzip to a convenient location
Open DOS window and execute:
sed -f cleandq file1.txt > file2.txt
cleandq = included script file
If this doesn't work, use the following command to dump out a file after
the replacement commands have been executed, but before the remaining
double-quotes have been switched to single-quotes. Open the output file
in a text editor and search for double-quotes. Once found, the cleandq
file may need to be modified to catch the double-quotes that found their
way through:
sed -f debug file1.txt > file2.txt
The double-quotes remaining in the output file will be changed to single-quotes by the cleandq script. Look closely for anomalies here.
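
The cleandq script itself is not reproduced here. As a rough sketch of the approach described above (protect the double-quotes that act as text delimiters, switch whatever is left to single-quotes, then restore the delimiters), something like the following might work. The function name clean_quotes and the @@Q@@ placeholder are made up for this example. It assumes every field is quoted, each line ends right after the closing quote (no trailing carriage return), and the placeholder never occurs in the data.

```shell
#!/bin/sh
# Sketch only: a stand-in for what a script like cleandq might do.
# MARK is an arbitrary placeholder assumed never to appear in the data.
MARK='@@Q@@'

clean_quotes() {
    # 1-3: hide the delimiter quotes (line start, line end, and "," between fields)
    # 4:   any double-quote still present is inside a field, so make it single
    # 5:   put the delimiter quotes back
    sed -e "s/^\"/$MARK/" \
        -e "s/\"\$/$MARK/" \
        -e "s/\",\"/$MARK,$MARK/g" \
        -e "s/\"/'/g" \
        -e "s/$MARK/\"/g"
}

printf '%s\n' '"FIELD1","Field 2 "is cool"","FIELD3"' | clean_quotes
# prints "FIELD1","Field 2 'is cool'","FIELD3"
```

Dropping the last two sed expressions gives the equivalent of a debug run: the placeholders stay in place and any double-quote still visible is one that would be turned into a single-quote.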
This tars and gzips any files meeting the criteria in the specified subdirectories. Specifically, I have a base directory that contains several subdirectories. Not all subdirectories should be archived, so I keep the names of those I want archived in a separate text file. The only files in those subdirectories that I want archived are those for the previous month. I've set it up so that these files are named something like LOG_09052003.dat, LOG_09062003.dat, etc. This gawk script goes through each subdirectory listed in the text file and creates a tarball out of the files that match the pattern *PM??YYYY*, where PM is the previous month (i.e., 09, 10, etc.) and YYYY is the current year. Then it gzips the tarball and, if all is okay, deletes the original LOG files.
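
The gawk script is not shown here. The steps it performs can be sketched in plain shell roughly as follows. This is only an illustration under assumptions: the function names prev_month, archive_month and archive_prev_month and the tarball name archive_PMYYYY.tar are invented for the example, and the January case (using December of the previous year) is an extra assumption, since the description above only mentions the current year.

```shell
#!/bin/sh
# Sketch only: not the gawk script described above. All names are hypothetical.

# Print "PM YYYY" for the month before the given month (01-12) and year.
prev_month() {
    m=${1#0}                     # strip a leading zero so 08 and 09 are not read as octal
    y=$2
    pm=$((m - 1))
    if [ "$pm" -eq 0 ]; then     # January: assume December of the previous year
        pm=12
        y=$((y - 1))
    fi
    printf '%02d %s\n' "$pm" "$y"
}

# Archive files matching *PM??YYYY* in each subdirectory named in a list file.
# Arguments: base directory, list file (one subdirectory per line), PM, YYYY.
archive_month() {
    base=$1 list=$2 pm=$3 yyyy=$4
    while IFS= read -r dir; do
        [ -n "$dir" ] && [ -d "$base/$dir" ] || continue
        (
            cd "$base/$dir" || exit 1
            set -- *"$pm"??"$yyyy"*          # expand the pattern in this directory
            [ -e "$1" ] || exit 0            # nothing matched, nothing to do
            tarball="archive_${pm}${yyyy}.tar"
            # Delete the originals only if both tar and gzip succeeded.
            tar -cf "$tarball" "$@" && gzip "$tarball" && rm -f "$@"
        )
    done < "$list"
}

# Convenience wrapper that works out the previous month from today's date.
archive_prev_month() {
    pmy=$(prev_month "$(date +%m)" "$(date +%Y)")
    archive_month "$1" "$2" "${pmy% *}" "${pmy#* }"
}
```

A run might look like archive_prev_month /data/logs dirs.txt, where both paths are placeholders. Try it on a copy of the data first, since the originals are removed once the tarball is compressed.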
sed -e '/[0-9]SAMPLE/d' < mri.txt > mri3.txt
Deletes lines that contain a digit immediately followed by the word SAMPLE; reads mri.txt and writes to mri3.txt.

sed 's/^.*MSH/MSH/' file.txt > file2.txt
Removes anything before MSH on a line (replaces everything from the beginning of the line through MSH with just MSH).

sed -n '/SMITH/p' file.txt > file2.txt
Prints the lines in file.txt that contain SMITH to file2.txt.

sed 's/\"//g' file1.txt > file2.txt
Removes all double-quotes from a file.

sed -e '/PID.\{15\}V/d' < FILE1.dat > FILE2.dat
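Deletes all lines containing PID, then any 15 characters, then V; reads FILE1.dat and writes to FILE2.dat.

grep 'SAMPLE \{19\}CMC' mpi.txt
Finds lines containing SAMPLE, 19 spaces, then CMC.

ls | grep '\.[0-9]'
Finds files whose names contain a period followed by a digit.

grep 'Text string:' text.txt | sed 's/^.*:/:/' > text2.txt
Selects the lines of text.txt that contain 'Text string:' and replaces everything through the last colon on each line with a single colon, which strips the 'Text string' label; writes to text2.txt.

sed 's/^.*[^0-9]//' trans.txt > p2.txt
Removes everything up to and including the last non-digit character on each line, leaving only the trailing digits.

netstat -a | grep $SERVER_PORT | grep '*-*' | wc -l
Special-purpose command: counts certain types of active socket connections.

awk -F'|' '{ print $4 }' file1.txt > file2.txt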
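Takes the 4th field of each line of pipe-delimited file1.txt and writes it to file2.txt.

grep -c '|K[A-Z]\{3,\}\^DAV' file.txt
Counts lines containing |K, three or more uppercase letters, then ^DAV.

grep -c 'PID.\{15\}M[0-9]\{9\}|.*|V[0-9]\{10\}|' test.dat
Counts lines with an M unit number (PID-3) and a V account number (PID-18).

grep -c 'PID.\{15\}V[0-9]\{9\}|.*|[MJ][0-9]\{10\}|' test.dat
Counts lines with a V unit number (PID-3) and an M or J account number (PID-18).

In these three grep commands the | is unescaped because an unescaped | is a literal character in basic regular expressions, whereas GNU grep reads \| as alternation.

Note: several of these are applicable to HL7 messages, but might serve as examples for other purposes.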
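
To see how the PID patterns above behave, here is a small check against made-up data. The records below are fabricated and much shorter than real HL7 segments; they only put the right characters at the offsets the patterns expect. The sample function is a hypothetical helper, and the patterns use an unescaped | for the literal pipe, which is how basic regular expressions treat it.

```shell
#!/bin/sh
# Fabricated sample records, for illustration only (not real HL7 messages).
sample() {
    cat <<'EOF'
MSH|^~\&|LAB|CMC
PID|1|ABCDEFGHIJK|M123456789|DOE^JOHN|V1234567890|
PID|1|ABCDEFGHIJK|V123456789|DOE^JANE|J1234567890|
EOF
}

# Count lines with an M unit number followed later by a V account number.
sample | grep -c 'PID.\{15\}M[0-9]\{9\}|.*|V[0-9]\{10\}|'      # prints 1

# Count lines with a V unit number followed later by an M or J account number.
sample | grep -c 'PID.\{15\}V[0-9]\{9\}|.*|[MJ][0-9]\{10\}|'   # prints 1

# Print the 4th pipe-delimited field of every line:
# CMC, M123456789, V123456789 (one per line).
sample | awk -F'|' '{ print $4 }'
```

The .\{15\} part is what ties the pattern to a fixed layout: it only matches when exactly 15 characters sit between PID and the unit number, so records with longer or shorter leading fields are not counted.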