Extract HTML Content - Control with htmlq | Online Free DevTools by Hexmos
Extract HTML content with htmlq, a CSS selector tool. Query HTML files efficiently from the command line with this free online tool, no registration required.
htmlq
Use CSS selectors to extract content from HTML files. More information: https://github.com/mgdm/htmlq.
- Return all elements of class
card:
cat {{path/to/file.html}} | htmlq '.card'
- Get the text content of the first paragraph:
cat {{path/to/file.html}} | htmlq --text 'p:first-of-type'
- Find all the links in a page:
cat {{path/to/file.html}} | htmlq --attribute href 'a'
- Remove all images and SVGs from a page:
cat {{path/to/file.html}} | htmlq --remove-nodes 'img' --remove-nodes 'svg'
- Pretty print and write the output to a file:
htmlq --pretty --filename {{path/to/input.html}} --output {{path/to/output.html}}
