Free Scripts and Stuff

Text File Double-Quote Cleaner (dos/win32)


This removes extraneous double-quotes from comma-delimited text files where double-quotes are supposed to be the text delimiter.

Say you have a record like this, where someone used a double-quote within a string:

"FIELD1","Field 2 "is cool"","FIELD3"
Running the script changes it to:
"FIELD1","Field 2 'is cool'","FIELD3"

Instructions

First, slap the moron who used double-quotes within a data field.
Download THIS ZIP FILE
Unzip to a convenient location
Open a DOS window and execute:

sed -f cleandq file1.txt > file2.txt
cleandq = included script file
file1.txt = file you want to change
file2.txt = output file

If this doesn't work, use the following command instead. It dumps the file as it stands after the replacement commands have been executed, but before the remaining double-quotes have been switched to single-quotes. Open the output file in a text editor and search for double-quotes. Once you find them, the cleandq file may need to be modified to catch the double-quotes that slipped through:

sed -f debug file1.txt > file2.txt

The double-quotes remaining in the output file will be changed to single-quotes by the cleandq script. Look closely for anomalies here.
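
The cleandq script itself lives in the zip file, so what follows is not it. It is a rough shell sketch of the two-stage idea described above: first mark the double-quotes that really are delimiters, then turn whatever double-quotes are left into single-quotes. The function name clean_quotes and the \001 placeholder byte are made up for this illustration. It assumes every field is quoted, fields are separated by exactly "," with no spaces, lines do not end in a carriage return, and the byte \001 never occurs in the data.

```shell
# clean_quotes: hypothetical stand-in for cleandq, reads stdin, writes stdout.
clean_quotes() {
  m=$(printf '\001')          # placeholder byte, assumed absent from the data
  sed -e "s/^\"/$m/" \
      -e "s/\"\$/$m/" \
      -e "s/\",\"/$m,$m/g" \
      -e "s/\"/'/g" \
      -e "s/$m/\"/g"
  # 1-3: hide the opening, closing, and separator quotes behind the placeholder
  # 4:   every double-quote still visible is stray, so make it a single-quote
  # 5:   turn the placeholders back into double-quotes
}

printf '%s\n' '"FIELD1","Field 2 "is cool"","FIELD3"' | clean_quotes
# prints: "FIELD1","Field 2 'is cool'","FIELD3"
```

To process a file, run clean_quotes < file1.txt > file2.txt. A field that legitimately contains the three characters "," cannot be told apart from a field boundary this way, so look at the output before trusting it.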


archive.gawk

This is a nice little gawk script I wrote to tar and gzip the previous month's text files.
I have a batch file that runs this: gawk -f archive.gawk

This tars and gzips any files meeting the criteria in the specified subdirectories. Specifically, I have a base directory which contains several subdirectories. Not all of them should be archived, so I keep the names of the ones I want archived in a separate text file. The only files in those subdirectories that I want archived are the ones for the previous month. I've set it up so that these files are named something like LOG_09052003.dat, LOG_09062003.dat, etc. So this gawk script goes through each subdirectory listed in the text file and creates a tarball out of the files that match the wildcard pattern *PM??YYYY*, where PM is the previous month (09, 10, etc.) and YYYY is the current year. Then it gzips the tarball and, if all's okay, deletes the original LOG files.
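
For anyone who wants the general shape without gawk, here is a minimal shell sketch of the steps just described. It is not archive.gawk. The function names, the archive_MMYYYY naming, and the argument order are all invented for illustration. One deliberate departure from the description: when the current month is January, the sketch looks for December of the previous year rather than December of the current year.

```shell
# prev_month_glob MONTH YEAR
# Prints a wildcard such as *09??2003* for the month before MONTH/YEAR.
prev_month_glob() {
  pm=${1#0}                      # drop a leading zero: 09 -> 9
  py=$2
  if [ "$pm" -eq 1 ]; then       # January: step back to December of last year
    pm=12
    py=$((py - 1))
  else
    pm=$((pm - 1))
  fi
  printf '*%02d??%d*\n' "$pm" "$py"
}

# archive_dir DIR GLOB NAME
# Tars and gzips the files in DIR matching GLOB into NAME.tar.gz and
# deletes the originals only if both steps succeed.
archive_dir() {
  (
    cd "$1" || exit 1
    name=$3
    set -- $2                    # unquoted on purpose: expands the wildcard
    [ -e "$1" ] || exit 0        # nothing matched, nothing to do
    tar -cf "$name.tar" "$@" && gzip "$name.tar" && rm -f "$@"
  )
}

# archive_listed LISTFILE BASEDIR MONTH YEAR
# LISTFILE holds one subdirectory name per line; MONTH/YEAR is today's date.
archive_listed() {
  glob=$(prev_month_glob "$3" "$4")
  tag=$(printf '%s' "$glob" | tr -d '*?')    # e.g. 092003
  while IFS= read -r sub; do
    if [ -n "$sub" ]; then
      archive_dir "$2/$sub" "$glob" "archive_$tag"
    fi
  done < "$1"
}
```

A call would look something like archive_listed dirs.txt /path/to/base "$(date +%m)" "$(date +%Y)", where dirs.txt and the base path are placeholders for your own list file and directory.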


PHP Ascii Table

View: ascii.php Source: ascii.php.txt
This is a little PHP-generated ASCII table suitable for use as an Opera browser panel. If you'd like to use it, please copy it to your own damn server! Thanks!

Misc Commands

That's what she SED...

sed -e '/[0-9]SAMPLE/d' < mri.txt >mri3.txt
deletes lines containing a digit immediately followed by the word SAMPLE from mri.txt, writes the rest to mri3.txt

sed 's/^.*MSH/MSH/' file.txt > file2.txt
removes anything before MSH on a line (replaces the beginning of the line through the last MSH with just MSH)

sed -n '/SMITH/p' file.txt > file2.txt
print lines containing SMITH in file.txt to file2.txt

sed 's/\"//g' file1.txt>file2.txt
remove all " from a file

sed -e '/PID.\{15\}V/d' < FILE1.dat > FILE2.dat
Deletes all lines containing PID, then any 15 characters, then V; writes the remaining lines of FILE1.dat to FILE2.dat

grep 'SAMPLE \{19\}CMC' mpi.txt
finds SAMPLE, 19 spaces, CMC.

ls|grep '\.[0-9]'
lists files whose names contain a dot followed by a digit

grep 'Text string:' text.txt | sed 's/^.*:/:/' >text2.txt
keeps only the lines of text.txt containing 'Text string:', replaces everything up to and including the last colon on each with a single colon, output to text2.txt
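
Because .* is greedy, the command above cuts through the last colon on the line and leaves a colon behind. If the goal is just to drop the label, a narrower substitution may be closer to the mark. This is a sketch that assumes the label sits at the very start of the line; the function name strip_label is made up.

```shell
# strip_label: print only lines that start with "Text string:", minus that label.
strip_label() {
  sed -n 's/^Text string://p' "$@"
}

printf '%s\n' 'Text string: 10:30 start' 'other line' | strip_label
# prints: " 10:30 start" (without the quotes; the leading space is kept)

# For comparison, the original pipeline on the same input:
printf '%s\n' 'Text string: 10:30 start' 'other line' |
  grep 'Text string:' | sed 's/^.*:/:/'
# prints: :30 start
```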

sed 's/^.*[^0-9]//' trans.txt>p2.txt
deletes everything up to and including the last non-digit on each line, so only the trailing digits are left
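
The difference between "keep the trailing number" and "keep every digit" is easy to miss, so here is a small side-by-side sketch. The function names are made up for illustration.

```shell
# trailing_digits: same expression as above; keeps only the digits that end each line.
trailing_digits() {
  sed 's/^.*[^0-9]//' "$@"
}

# only_digits: deletes every non-digit character wherever it appears.
only_digits() {
  sed 's/[^0-9]//g' "$@"
}

echo 'ID 42 amount 1500' | trailing_digits   # prints 1500
echo 'ID 42 amount 1500' | only_digits       # prints 421500
echo 'amount 1500 USD'   | trailing_digits   # prints an empty line
```

The last case is the one to watch: if a line ends in anything other than a digit, trailing_digits wipes the whole line.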

netstat -a |grep $SERVER_PORT|grep '*-*'|wc -l
Special command, counts certain types of active socket connections

awk -F'|' '{ print $4 }' file1.txt > file2.txt
writes the 4th field of each line of pipe-delimited file1.txt to file2.txt

grep -c '|K[A-Z]\{3,\}\^DAV' file.txt
count lines containing |K, three or more capital letters, then ^DAV (the | is left unescaped on purpose: in grep's basic regex syntax a plain | is a literal pipe, while GNU grep reads \| as "or")

grep -c 'PID.\{15\}M[0-9]\{9\}|.*|V[0-9]\{10\}|' test.dat
count lines with an M unit number (PID-3), V account # (PID-18)

grep -c 'PID.\{15\}V[0-9]\{9\}|.*|[MJ][0-9]\{10\}|' test.dat
count lines with a V unit number (PID-3), M or J account # (PID-18)

Note: several of these are specific to HL7 messages, but they might serve as examples for other purposes.
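
To try the PID counts without a real feed, here is a small sketch run against made-up data. The sample lines are invented and heavily shortened (a real PID segment has many more fields between PID-3 and PID-18), and the function names are hypothetical. The patterns use an unescaped |, which matches a literal pipe in grep's basic regex syntax; with GNU grep, \| means alternation instead.

```shell
# count_m_unit_v_account: lines with an M unit number and a V account number.
count_m_unit_v_account() {
  grep -c 'PID.\{15\}M[0-9]\{9\}|.*|V[0-9]\{10\}|' "$@"
}

# count_v_unit_mj_account: lines with a V unit number and an M or J account number.
count_v_unit_mj_account() {
  grep -c 'PID.\{15\}V[0-9]\{9\}|.*|[MJ][0-9]\{10\}|' "$@"
}

# Invented sample: exactly 15 characters sit between "PID" and the unit number.
sample='PID|1|ABCDEFGHIJK|M123456789|DOE^JOHN|V1234567890|
PID|1|ABCDEFGHIJK|V123456789|DOE^JANE|J1234567890|
OBX|1|TX|note'

printf '%s\n' "$sample" | count_m_unit_v_account    # prints 1
printf '%s\n' "$sample" | count_v_unit_mj_account   # prints 1
```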