[wplug] awk help

Gary Morrow gary.morrow at ansys.com
Fri Aug 27 18:07:38 EDT 2004


<obligatory Larry Wall quote>
: I've tried (in vi) "g/[a-z]\n[a-z]/s//_/"...but that doesn't
: cut it. Any ideas? (I take it that it may be a two-pass sort of solution).
In the first pass, install perl. :-)

                        --- Larry Wall <6849 at jpl-devvax.JPL.NASA.GOV> 
</obligatory Larry Wall quote>

I would just go ahead and learn a little Perl:

#!/usr/bin/perl
use strict;
use warnings;

open(my $fh, '<', $ARGV[0]) || die "Could not read $ARGV[0]: $!";

while ( my $line = <$fh> ) {

	chomp $line;                       # drop the newline so it does not end up in the last field
	my @qparts = split(/"/, $line);    # first split the line on quotes
	my $quote = 0;                     # set a toggle for whether we're in a quote section or not

	foreach my $qpart ( @qparts ) {

		if ( $quote ) {
			# if we're in a quote section just print it, quotes and all
			print "\"$qpart\"\n";
			$quote = 0;
		} else {
			# if we're not, split on commas and print the non-empty parts
			# (note: this also skips fields that really are empty, like a,,b)
			foreach my $part ( split(/,/, $qpart) ) {
				print "$part\n" if length $part;
			}
			$quote = 1;
		}

	}

}

close $fh;


 >cat xxx.dat
"first, field test",some,text,"is, in",here

 >perl ptest xxx.dat
"first, field test"
some
text
"is, in"
here
 
It may get a little weird if you start a quoted section and never close it (the rest
of that line gets printed as one quoted field), but other than that it should work.
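
If you would rather stay in awk, the same quote-toggle idea can be sketched there too. This is only a rough sketch, not a real CSV parser: split_csv is a name I made up, it assumes quotes are always balanced and never escaped (no "" inside a field), it drops the quotes to match the output asked for in the original question, and it skips empty fields.

```shell
#!/bin/sh
# split_csv: hypothetical helper name, not part of awk or any standard tool.
# Splitting each line on the double quote means even-numbered pieces were
# inside quotes and odd-numbered pieces were outside them.
split_csv() {
    awk -F'"' '{
        for (i = 1; i <= NF; i++) {
            if (i % 2 == 0) {
                # inside quotes: print the piece untouched, commas included
                print $i
            } else {
                # outside quotes: split on commas, skipping empty pieces
                n = split($i, parts, ",")
                for (j = 1; j <= n; j++)
                    if (parts[j] != "") print parts[j]
            }
        }
    }' "$@"
}

# Usage: reads the named files, or stdin when no file is given.
printf '%s\n' 'some,text,"is, in",here' | split_csv
```

Each field comes out on its own line, so the result can be piped into whatever does the rest of the processing.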

Gary

chris.romano at verizon.net wrote:
>>From: duncanhutty at comcast.net
>>Date: 2004/08/27 Fri PM 04:48:01 EDT
>>To: General user list <wplug at wplug.org>
>>Subject: [SPAM] Re: [wplug] awk help
>>
>>How about:
>>First apply a sed transformation that converts any comma *inside quotes* to a suitable escape sequence of your choosing, then send the result through your awk, and finally convert every instance of that escape sequence back to a comma.
>>1st sed command:
>>/".*"/s/,/\\,/g
>>
>>Duncan Hutty
>>-------------- Original message -------------- 
>>
>>
>>>I am having some issues with parsing a file using awk. I am not sure if awk is 
>>>the best tool, but I do not know perl at all. I have a comma delimited file 
>>>that has a few fields quoted. In those quoted fields there are commas. So when 
>>>awk processes those fields it thinks that they are separate and not one string. 
>>>i.e. 
>>>
>>>some,text,"is, in",here 
>>>
>>>
>>>awk spits out: 
>>>some 
>>>text 
>>>"is 
>>>in" 
>>>here 
>>>
>>>I want it to be 
>>>some 
>>>text 
>>>is, in 
>>>here 
>>>
>>>my awk command is: 
>>>`cat $1 | awk 'BEGIN{FS=","}{print 
>>>$1","$3",\""$5"\",\""$6"\","$7","$8","$9","$46","$49}' > output.trans` 
> 
> 
> 
> That did not work.  Unless I am reading that wrong.
> 
> sed /".*"/s/,/\\,/g filename > output
> tried ...
> sed /".*"/s/,/\:/g filename > output (changes all "," to ":")
> sed /".*"/s/,/\\:/g filename > output (changes all "," to ":")
> I will try to mess with sed though.  That is a good idea.
> 
> Thanks,
> Chris
> 
> _______________________________________________
> wplug mailing list
> wplug at wplug.org
> http://www.wplug.org/mailman/listinfo/wplug
> 

-- 
Gary Morrow
UNIX Systems Engineering and CASE Tools.
Ansys Inc.
gary.morrow at ansys.com
Phone: 724-514-2978
Fax: 724.514.3117





