
How to strip tag


Help!php


I have a paragraph with tags that need to be stripped off.

 

so the paragraph looks like

 

<div id="ctl00_placeholderMain_pnlInTheBox" class="tabitem">                    <p>                      HP LaserJet 9050 printer<br/> Power cord<br/> Parallel cable<br/> HP LaserJet Q8543X Smart print cartridge<br/> Printer documentation<br/> Printer software CD<br/> Control panel overlay<br/> Face-up output bin<br/> Two 500-sheet input tray<br/> 100 Sheet Multipurpose Tray<br/> HP JetDirect Fast</p>                </div>

 

I want it to look like

 

HP LaserJet 9050 printer

Power cord

Parallel cable

HP LaserJet Q8543X Smart print cartridge

Printer documentation

Printer software CD

Control panel overlay

Face-up output bin

Two 500-sheet input tray

100 Sheet Multipurpose Tray

HP JetDirect Fast

 

How would I go about doing this?

 

Currently I use:

 

$inbox = $html->find( "#ctl00_placeholderMain_pnlInTheBox" );

if ( isset( $inbox[0] ) )
{
    $box = $inbox[0];

    // Keep only the part after the first ";" when one is present.
    $box = strpos( $box, ';' ) !== FALSE ? substr( $box, strpos( $box, ';' ) + 1 ) : $box;
}
else
{
    $box = "0";
}


His output doesn't show any HTML, not even BR tags. It looks like BR tags should be replaced with a line break. So, I would use a preg_replace() to change any BR tags to line breaks then use strip_tags() to remove any and all remaining tags. Note: I'd use preg_replace() instead of str_replace() to cover all the variations of BR tags.


Based upon the OP's description and example input/output data, this should work:

function removeHTML($inputStr)
{
    $outputStr = preg_replace('#<br>|<br/>#i', "\n", $inputStr);
    $outputStr = strip_tags($outputStr);
    return $outputStr;
}
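For reference, here is a rough sketch of what that function returns on a shortened version of the OP's sample. Because the markup has padding spaces around each item, the raw result keeps stray whitespace, so the sketch also trims each line. The helper name cleanLines() is made up for this example and is not from the thread.

```php
<?php
// The function from the post above, with the same behavior.
function removeHTML($inputStr)
{
    $outputStr = preg_replace('#<br>|<br/>#i', "\n", $inputStr);
    return strip_tags($outputStr);
}

// Hypothetical helper: trim every line and drop the empty ones.
function cleanLines($text)
{
    $lines = array_map('trim', explode("\n", $text));
    $lines = array_filter($lines, 'strlen');
    return implode("\n", $lines);
}

// Shortened copy of the OP's sample markup.
$html = '<div id="ctl00_placeholderMain_pnlInTheBox" class="tabitem">   <p>   '
      . 'HP LaserJet 9050 printer<br/> Power cord<br/> Parallel cable</p>   </div>';

$result = cleanLines(removeHTML($html));
echo $result;
// Prints three lines:
// HP LaserJet 9050 printer
// Power cord
// Parallel cable
```

Whether the trimming step is wanted depends on how the text is used afterwards; in an HTML page the extra spaces would collapse anyway.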



 

A little more robust pattern:

$outputStr = preg_replace('#<br[\s\/]*>#i', "\n", $inputStr);
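To see the difference between the two patterns, a quick check along these lines may help. The list of tag variants is just an assumption about what might show up in scraped markup.

```php
<?php
$simple = '#<br>|<br/>#i';   // pattern from the first function
$robust = '#<br[\s\/]*>#i';  // pattern suggested above

// Sample spellings of the tag (chosen for illustration only).
$variants = ['<br>', '<br/>', '<br />', '<BR >', '<br  / >'];

foreach ($variants as $tag) {
    printf(
        "%-10s simple: %d  robust: %d\n",
        $tag,
        preg_match($simple, $tag),
        preg_match($robust, $tag)
    );
}
```

The simple pattern only matches the first two spellings, while the second pattern matches all five. Neither one matches a BR tag that carries attributes, such as <br class="x">, so that case would need a wider pattern.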

 

His output doesn't show any HTML, not even BR tags. It looks like BR tags should be replaced with a line break.

 

I'm not sure I agree with that assessment, because you would only see the BR tags if he posted the source.

 

In any case, if he wanted BR instead of a newline he can just run nl2br after your function. That way, BRs are preserved but all other HTML is removed.


Well, his input data explicitly showed the BR (and other) tags and his required output explicitly excluded all of the tags. So, I took that to mean he wanted the HTML markup removed and logical line breaks replacing the HTML line break tags. Of course he did state "I want it to look like" and that could be construed to only mean the "displayed" output. But, in that case the original content was fine to begin with.

 

In any case, the fact that we interpreted the requirement differently means the request was not clear.


Thank you so much, everyone, for your answers.

 

I am a her. :)...

 

I apologize for the confusion.

 

Without stripping the tags, it shows:

 

<div id="ctl00_placeholderMain_pnlInTheBox" class="tabitem">                      <p>                      HP LaserJet 9050 printer<br/> Power cord<br/> Parallel cable<br/> HP LaserJet Q8543X Smart print cartridge<br/> Printer documentation<br/> Printer software CD<br/> Control panel overlay<br/> Face-up output bin<br/> Two 500-sheet input tray<br/> 100 Sheet Multipurpose Tray<br/> HP JetDirect Fast</p>                </div>

 

All I wanted to do is get rid of

 

<div id="ctl00_placeholderMain_pnlInTheBox" class="tabitem">                      <p>

 

and

 

</p>                </div>

 

That's all.


Then something like this should do that (modifying Psycho's function):

function removeHTML($inputStr)
{
    $outputStr = preg_replace('#<br[\s\/]*>#i', "\n", $inputStr);
    $outputStr = strip_tags($outputStr);
    $outputStr = nl2br($outputStr);
    return $outputStr;
}

 

This will: 1. Convert all variations of <br> to newline characters, 2. Remove all HTML tags, and 3. Convert newline characters back to <br>.

 

So you will remove all HTML while preserving line breaks.
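Another way to get the same effect, as a rough sketch, is to let strip_tags() keep the BR tags itself through its second argument. The function name stripAllButBr() is made up for this example. Note that with this approach the original <br/> tags stay exactly as written, without the newline round trip.

```php
<?php
// Hypothetical helper: strip every tag except <br>, then trim the padding.
function stripAllButBr($inputStr)
{
    // Allowing '<br>' also keeps the self-closing <br/> form.
    return trim(strip_tags($inputStr, '<br>'));
}

// Shortened copy of the OP's sample markup.
$html = '<div id="ctl00_placeholderMain_pnlInTheBox" class="tabitem">   <p>   '
      . 'HP LaserJet 9050 printer<br/> Power cord</p>   </div>';

$result = stripAllButBr($html);
echo $result;
// Prints: HP LaserJet 9050 printer<br/> Power cord
```

This only removes the wrapping div and p tags (and any other markup), which seems to be what was asked for, but it assumes the kept BR tags carry no attributes you care about.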


This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes that a mischievous user may abuse when posting text that will be shown to other users.

 

It doesn't change the attributes of the tags that it keeps. So something like

<i onmouseover="leetHaxorsFunction();">

will still exist.
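A small check that shows this behavior, assuming only the i tag is on the allow-list:

```php
<?php
$input = '<i onmouseover="leetHaxorsFunction();">hover me</i> <b>bold</b>';

// Keep <i>, strip everything else. The attribute on <i> is left untouched.
$output = strip_tags($input, '<i>');

echo $output;
// Prints: <i onmouseover="leetHaxorsFunction();">hover me</i> bold
```

So if the text comes from users, an allow-list in strip_tags() is not enough on its own; the attributes on the kept tags would still need to be handled separately.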
