[wplug] awk help [RESOLVED]

chris.romano at verizon.net
Fri Aug 27 21:12:30 EDT 2004


I basically used this script to change every "," that is not inside quotes to ":".  From there I will use the awk script to pull out the fields that I want.  I haven't tested the awk script yet because the perl script is going to take a few minutes or so to run.  The file is 200MB+ and has over 618,000 records.  If anyone is interested in the exact script, let me know.  I might just post it tomorrow after I am finished, just in case anyone ever needs it.
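
For anyone who would rather stay in awk for that first step, here is a minimal sketch of the same comma-to-colon idea. It is not the exact script used on the real file. The function name commas_to_colons is made up for illustration, and the sketch assumes that every quoted field opens and closes on the same line, that there are no escaped quotes inside a field, and that the data contains no ":" of its own.

```shell
#!/bin/sh
# Sketch only: commas_to_colons is a made-up name, not the script from this thread.
# Assumes every quoted field opens and closes on the same line, contains no
# escaped quotes, and that the data has no ":" of its own.
commas_to_colons() {
    awk 'BEGIN { FS = OFS = "\"" }
         {
             # Splitting on the quote character puts text outside quotes in the
             # odd-numbered fields and text inside quotes in the even ones.
             for (i = 1; i <= NF; i += 2)
                 gsub(/,/, ":", $i)
             print
         }' "$@"
}

# Example: convert the sample line from this thread, then pick fields 3 and 4.
printf '%s\n' 'some,text,"is, in",here' | commas_to_colons | awk -F: '{ print $3; print $4 }'
```

On the sample line the first stage produces some:text:"is, in":here, so the second awk prints "is, in" and then here. Because the rewrite happens line by line, it should also cope with a large file without holding it in memory, though I have not timed it against the 200MB+ file.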

Thanks to everyone that helped.

Chris Romano 
> 
> From: Gary Morrow <gary.morrow at ansys.com>
> Date: 2004/08/27 Fri PM 06:07:38 EDT
> To: General user list <wplug at wplug.org>
> Subject: Re: [wplug] awk help
> 
> <obligatory Larry Wall quote>
> : I've tried (in vi) "g/[a-z]\n[a-z]/s//_/"...but that doesn't
> : cut it. Any ideas? (I take it that it may be a two-pass sort of solution).
> In the first pass, install perl. :-)
> 
>                         --- Larry Wall <6849 at jpl-devvax.JPL.NASA.GOV> 
> </obligatory Larry Wall quote>
> 
> I would just go ahead and learn a little Perl:
> 
> #!/usr/bin/perl
> 
> open(FILE, "<", $ARGV[0]) || die "Could not read $ARGV[0]: $!";
> 
> while ( <FILE> ) {
> 
> 	@line = split(/\"/,$_);    # first split the line on quotes
> 	$quote = 0;                # set a toggle for whether we're in a quote section or not
> 
> 	foreach $qpart ( @line ) {
> 
> 		if ( $quote ) {
> 			# if we're in a quote section just print it
> 			print "\"$qpart\"";
> 			$quote = 0;
> 		} else {
> 			# if we're not, split on commas and print the parts
> 			@parts = split(/\,/,$qpart);
> 			foreach $part ( @parts ) {
> 				print "$part\n";
> 			}
> 			$quote = 1;
> 		}
> 
> 	}
> 
> }
> 
> close FILE;
> 
> 
>  >cat xxx.dat
> "first, field test",some,text,"is, in",here
> 
>  >perl ptest xxx.dat
> "first, field test"
> some
> text
> "is, in"
> here
>  
> It may get a little weird if you start a quoted section and never close it, but other than
> that it should work.
> 
> Gary
> 
> chris.romano at verizon.net wrote:
> >>From: duncanhutty at comcast.net
> >>Date: 2004/08/27 Fri PM 04:48:01 EDT
> >>To: General user list <wplug at wplug.org>
> >>Subject: [SPAM] Re: [wplug] awk help
> >>
> >>How about:
> >>First apply a sed transformation that converts every comma *inside quotes* to a suitable escape sequence of your choice, then send the result through your awk, and finally convert every instance of that escape sequence back to a comma.
> >>1st sed command:
> >>/".*"/s/,/\\,/g
> >>
> >>Duncan Hutty
> >>-------------- Original message -------------- 
> >>
> >>
> >>>I am having some issues with parsing a file using awk. I am not sure if awk is
> >>>the best tool, but I do not know perl at all. I have a comma-delimited file
> >>>that has a few fields quoted. In those quoted fields there are commas. So when
> >>>awk processes those fields it thinks that they are separate and not one string.
> >>>i.e. 
> >>>
> >>>some,text,"is, in",here 
> >>>
> >>>
> >>>awk spits out: 
> >>>some 
> >>>text 
> >>>"is 
> >>>in" 
> >>>here 
> >>>
> >>>I want it to be 
> >>>some 
> >>>text 
> >>>is, in 
> >>>here 
> >>>
> >>>my awk command is: 
> >>>`cat $1 | awk 'BEGIN{FS=","}{print 
> >>>$1","$3",\""$5"\",\""$6"\","$7","$8","$9","$46","$49}' > output.trans` 
> > 
> > 
> > 
> > That did not work.  Unless I am reading that wrong.
> > 
> > sed /".*"/s/,/\\,/g filename > output
> > tried ...
> > sed /".*"/s/,/\:/g filename > output (changes all "," to ":")
> > sed /".*"/s/,/\\:/g filename > output (changes all "," to ":")
> > I will try to mess with sed though.  That is a good idea.
> > 
> > Thanks,
> > Chris
> > 
> > _______________________________________________
> > wplug mailing list
> > wplug at wplug.org
> > http://www.wplug.org/mailman/listinfo/wplug
> > 
> 
> -- 
> Gary Morrow
> UNIX Systems Engineering and CASE Tools.
> Ansys Inc.
> gary.morrow at ansys.com
> Phone: 724-514-2978
> Fax: 724.514.3117
> 
> 
> 
> 



