PHP Parse HTML code

Friday, 25 November 2016

How can I parse HTML code held in a PHP variable if it is something like:

   <h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG!

I only want to get the text that's between the headings, and I understand that it's not a good idea to use regular expressions.

Answer

Use PHP's Document Object Model (DOM) extension:

   <?php
   $str = '<h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG';
   $DOM = new DOMDocument;
   $DOM->loadHTML($str);

   // get all <h1> elements
   $items = $DOM->getElementsByTagName('h1');

   // display the text of each <h1>, one per line
   for ($i = 0; $i < $items->length; $i++)
        echo $items->item($i)->nodeValue . "\n";
   ?>

This outputs:

 T1
 T2
 T3

[EDIT]: After OP Clarification:

If you want the content between the headings (Lorem ipsum. and so on), you can strip the headings directly with this regex:

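   <?php
   $str = '<h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG';
   // remove every <h1>...</h1> element, keeping only the text between them
   echo preg_replace("#<h1.*?>.*?</h1>#", "", $str);
   ?>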

This outputs:

Lorem ipsum.The quick red fox...... jumps over the lazy brown FROG
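
Since the question asked to avoid regular expressions, the same text can probably also be collected with the DOM alone. The following is a minimal sketch, not part of the original answer: it assumes the markup is a flat sequence of h1 headings each followed by its text, and `textBetweenHeadings` is a hypothetical helper name chosen for illustration. It walks the siblings that follow each heading until the next heading is reached.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// maps each <h1> heading's text to the text that follows it,
// up to the next <h1>.
function textBetweenHeadings(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $sections = [];
    foreach ($dom->getElementsByTagName('h1') as $heading) {
        $text = '';
        // Walk forward through the siblings until the next <h1>.
        for ($node = $heading->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node->nodeType === XML_ELEMENT_NODE && strtolower($node->nodeName) === 'h1') {
                break;
            }
            $text .= $node->textContent;
        }
        $sections[trim($heading->textContent)] = trim($text);
    }
    return $sections;
}

// Usage with the markup from the question.
$str = '<h1>T1</h1>Lorem ipsum.<h1>T2</h1>The quick red fox...<h1>T3</h1>... jumps over the lazy brown FROG';
print_r(textBetweenHeadings($str));
```

Keying the result by heading text keeps each piece of content associated with its heading, which the regex version loses because it concatenates everything into one string. If two headings share the same text, the later one overwrites the earlier one in this sketch.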
