Question

I want to extract a substring from a string, using regex and sed/egrep. For example, I want to extract the title from this html tag:

<title>DuckDuckGo — Privacy, simplified.</title>

And of course, I don't want the tags themselves.

The output should look like this:

DuckDuckGo — Privacy, simplified.

If possible, I want to do this with one command from the Linux terminal.

I've gotten this far:

wget -qO- www.duckduckgo.com | grep '<title>' | grep '</title>'

This gets the line containing the title tags. I also have this regex:

/(?:>).+(?:<)/g

According to this website, it should output:

>DuckDuckGo — Privacy, simplified.<

When I run it in the terminal, it doesn't. Is there a way to extract the inner html with one command (sed or egrep if possible)?

EDIT:

This is not specifically about extracting text from between two HTML tags, but rather about selecting text between two given strings without outputting the strings used for the selection and without using extra software. The suggested question has no answer that solves my problem.
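To make the goal concrete, here is a minimal sketch of the kind of one-liner being asked for, using only sed. The function name `extract_title` is made up for illustration, and the sketch assumes both delimiter strings sit on the same line of input:

```shell
# Hypothetical helper: print only the text between <title> and </title>.
# -n suppresses sed's default output; the trailing p prints a line only
# when the substitution matched. ':' is used as the s-command delimiter
# so the '/' in </title> needs no escaping.
extract_title() {
    sed -n 's:.*<title>\(.*\)</title>.*:\1:p'
}

# Sample input standing in for the output of wget -qO- ...
printf '%s\n' '<title>DuckDuckGo — Privacy, simplified.</title>' | extract_title
```

The capture group `\(...\)` is what keeps the delimiters out of the output. A non-capturing group such as `(?:>)` does not do that, because the text it matches is still part of the overall match; it is also Perl-style syntax that sed and egrep do not support. Where GNU grep with PCRE support is available, `grep -oP '(?<=<title>).*?(?=</title>)'` would be a lookaround-based alternative.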

Select text between two strings using regex

0 answers: