Delete all lines before first occurrence of specific string in file

Basically I have a file like:

junk
morejunk
somestring
bats
car
somestring
bats
car
somestring
bats
car

and I want to remove all of the junk before the first occurrence of somestring, so that the file looks like:

somestring
bats
car
somestring
bats
car
somestring
bats
car

I followed the advice from this question to use sed -i '0,/somestring/d' file.txt, but that also deletes the line containing the first occurrence of somestring, and I want to keep that line as the first line of the file.

Answers


With sed you could use:

sed -i '/somestring/,$!d' file

Explanation of the expression:

, matches lines starting from where the first address matches, and continues until the second match (inclusively).

$ matches the last line of the last file of input, or the last line of each file when the -i or -s options are specified.

! If the character follows an address range, then only lines which do not match the address range will be selected.

d Delete the pattern space; immediately start next cycle.

Result:

$ sed -i '/somestring/,$!d' file
$ cat file
somestring
bats
car
somestring
bats
car
somestring
bats
car
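
Because -i overwrites the file, it may be worth keeping a backup while you test the expression. The following is a minimal sketch, assuming a sed that accepts a suffix attached directly to -i (GNU and BSD sed both do); the temporary file and its contents are made up here to mirror the question's example.

```shell
# Build a throwaway sample file (contents mirror the question's example).
file=$(mktemp)
printf '%s\n' junk morejunk somestring bats car somestring bats car > "$file"

# Delete every line that is NOT in the range "first match .. last line".
# -i.bak edits in place and keeps the original as "$file.bak".
sed -i.bak '/somestring/,$!d' "$file"

cat "$file"        # trimmed file, starting at the first "somestring"
cat "$file.bak"    # untouched original, including the junk lines
```

Once the result looks right, the .bak file can be removed.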

$ sed -n '/somestring/,$p' infile
somestring
bats
car
somestring
bats
car
somestring
bats
car

The command suppresses automatic printing with -n and then, for the address range /somestring/,$ (from the first line matching somestring to the last line), runs the p command to print each line.
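
Without -i, this form only prints to standard output and leaves the input file alone, so the result has to be redirected somewhere. A small sketch of that, which also checks that the print form and the delete form agree; the infile and outfile variables point at temporary files created purely for illustration.

```shell
infile=$(mktemp)
outfile=$(mktemp)
printf '%s\n' junk morejunk somestring bats car somestring bats car > "$infile"

# Print only the range "first match .. last line"; the input is not modified.
sed -n '/somestring/,$p' "$infile" > "$outfile"

# The delete-the-complement form should produce exactly the same lines.
if sed '/somestring/,$!d' "$infile" | cmp -s - "$outfile"; then
    same=yes
else
    same=no
fi
echo "outputs identical: $same"
```

Do not redirect straight back onto the input file: the shell truncates it before sed gets to read it.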


Here's a way you can do it using awk:

awk '/somestring/ { f = 1 } f' file

When the pattern matches, f is set to true. The bare f is a pattern with no action, so awk prints every line for which f is true: the matching line and everything after it.

Another option, slightly more cryptic:

awk 'f += /somestring/' file

f is increased by 1 when the pattern matches and by 0 when it doesn't. Once a line has matched the pattern, f is non-zero, so the expression is true and every line from then on is printed.
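
Unlike sed -i, these awk commands only write to standard output. One common way to update the file itself is to write to a temporary file and move it over the original. This is a rough sketch of that pattern; the file and tmp names are illustrative.

```shell
file=$(mktemp)
printf '%s\n' junk morejunk somestring bats car somestring bats car > "$file"

# awk writes to stdout, so send the result to a temporary file first...
tmp=$(mktemp)
# ...and replace the original only if awk exited successfully.
awk '/somestring/ { f = 1 } f' "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```

Note that the replaced file takes on the ownership and permissions of the temporary file, which may matter for files that are not your own scratch data.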


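Another idiomatic awk solution (and the one with the fewest keystrokes) uses a range pattern whose end condition, 0, is never true, so the range runs from the first match to the end of the file:

$ awk '/somestring/,0' file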
somestring
bats
car
somestring
bats
car
somestring
bats
car
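
Since /somestring/ is a regular expression, a search string containing characters such as . or * may match more than intended. One possible workaround, sketched below, passes the string in with -v and tests it with index(), which does a plain substring search. The variable names str and s and the sample lines are made up for this illustration; also be aware that awk processes backslash escapes in -v values, so strings containing backslashes need extra care.

```shell
file=$(mktemp)
printf '%s\n' junk 'aXb' 'a.b' bats car 'a.b' bats > "$file"

str='a.b'   # as a regex, "a.b" would also match the earlier "aXb" line

# index($0, s) is non-zero when s occurs literally in the line. Used as the
# start of a range whose end (0) is never true, it prints through to EOF.
result=$(awk -v s="$str" 'index($0, s), 0' "$file")

printf '%s\n' "$result"
rm -f "$file"
```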

Concatenate with Echo and GNU Sed

You had most of the solution with GNU sed, which allows you to use both line numbers and regular expressions in range patterns. All you really need to do to get the behavior you want is to prepend the string you're using as your end-pattern to the resulting output.

For example:

$ str='somestring'; echo -e "${str}\n$(sed "0,/${str}/d" /tmp/corpus)"
somestring
bats
car
somestring
bats
car
somestring
bats
car

Basically, you assign the pattern to str, which you then reuse in both the echo statement and the sed expression. If you run into quoting problems related to variable interpolation, just replace the str variable with fixed strings in both your echo and sed commands. However, it works as-is with the posted corpus.
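
The behavior of echo -e varies between shells, so a variant using printf may travel better. The sketch below assumes GNU sed (the 0,/re/ address form is a GNU extension) and uses a temporary file in place of /tmp/corpus. Keep in mind that this approach prints the contents of str rather than the original matching line, so it only reproduces the file faithfully when that line consists of exactly the search string.

```shell
corpus=$(mktemp)
printf '%s\n' junk morejunk somestring bats car somestring bats car > "$corpus"

str='somestring'

# GNU sed only: "0,/re/" lets the range end on line 1 if that line matches.
# sed drops everything up to and including the first match; printf puts the
# search string back in front of what remains.
result=$(printf '%s\n' "$str"; sed "0,/${str}/d" "$corpus")

printf '%s\n' "$result"
rm -f "$corpus"
```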

