[wplug] Regex help?

Lawrence Wolfson oclarry44 at yahoo.com
Mon Apr 27 20:10:08 EDT 2015


Excel functions will also work, and they make sense for a one-time task.
Assuming that your column of data is in column A and starts at the top of the worksheet (cell A1) and assuming that the first space is always after the entire chapter/section/subsection/paragraph number, then do some variation of the following:
(Basically, this is to use a formula to get the first part in column A, a formula to get the second part in column B, then copy the values from column A to column C and column B to column D to change the formulas to values, then delete columns A and B.)
1. Insert 4 columns in front of the data so that columns A through D are empty and the data is in column E.
2. In cell A1, type "=LEFT(E1,FIND(" ",E1)-1)" (without the outside quotes).
3. In cell B1, type "=RIGHT(E1,LEN(E1)-FIND(" ",E1))" (without the outside quotes).
4. Copy cells A1 and B1 down the extent of the data.
5. Highlight columns A and B down to the end of the data, copy, and then "paste values" into columns C and D.
6. Delete columns A and B.
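
If you would rather script it than use worksheet formulas, the same "split at the first space" idea can be sketched in Python. This is only a rough equivalent of the LEFT/RIGHT formulas above, not something from the original data set; the function name split_at_first_space and the sample row are made up for illustration. One difference to be aware of: FIND returns an error in Excel when the cell has no space, whereas this sketch returns the whole cell as the number and an empty string as the text.

```python
def split_at_first_space(cell):
    """Split a cell into (heading number, body text) at the first space.

    Mirrors =LEFT(E1,FIND(" ",E1)-1) and =RIGHT(E1,LEN(E1)-FIND(" ",E1)).
    Assumes, as the formulas do, that the first space comes right after
    the heading number.
    """
    number, _, text = cell.partition(" ")
    return number, text


# Hypothetical sample row, taken from the example in the question.
rows = ["7.1.3.2. Section describing general attributes of an item."]
for row in rows:
    print(split_at_first_space(row))
```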

      From: Doug Green <diego96 at mac.com>
 To: General user list <wplug at wplug.org> 
 Sent: Monday, April 27, 2015 3:01 PM
 Subject: [wplug] Regex help?
   
Hi all,
I've got some work data that I'm trying to clean up. It's a single column of Excel data; each row contains a chapter/section/subsection/paragraph number followed by a string of text. I'd like to separate the heading numbers into a different column from the body of the text (creating a two-column data set).

My plan was to do a CSV export, then add a comma after the number. By re-importing the CSV this <should> put the section number in a different column than the text. Example:

7.1.3.2. Section describing general attributes of an item. 

Converted to: 

7.1.3.2., Section describing...

Emacs has a convenient replace-regex function that simply asks first what regex you want to search for and next what you want to replace it with. I've tried every combo of "[0-9]\." and "[:digit:]\." but I'm not matching ANY results. Can anyone point me in the right direction for a little help on writing a generalized regex that will match "any number followed by a period, optionally followed by up to 3 more numbers each followed by periods"? Clear as mud, right? 

Thanks!
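
For the regex approach described above, here is a minimal sketch in Python rather than Emacs, so the syntax differs from what replace-regexp expects. It reads "a number followed by a period, optionally followed by up to 3 more" as one to four "digits plus period" groups at the start of the line. The names HEADING and add_comma are made up for illustration. (On the Emacs side, one thing that is certain: a character class such as [:digit:] only works inside a bracket expression, i.e. [[:digit:]].)

```python
import re

# One to four groups of "digits followed by a period" at the start of a line,
# plus any whitespace that follows the heading number.
HEADING = re.compile(r"^((?:\d+\.){1,4})\s*")


def add_comma(line):
    """Insert a comma after the leading heading number, if there is one."""
    return HEADING.sub(r"\1, ", line, count=1)


print(add_comma("7.1.3.2. Section describing general attributes of an item."))
```

Note that a plain comma only works if the body text has no commas of its own; otherwise the re-import will split the text into extra columns, and writing the two parts out with Python's csv module (which quotes fields as needed) would be the safer route.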
_______________________________________________
wplug mailing list
wplug at wplug.org
http://www.wplug.org/mailman/listinfo/wplug


   

