kses

Class: kses4

Source Location: /main/inc/lib/kses-0.2.2/oop/php4.class.kses.php

Class Overview


Kses strips evil scripts!


Author(s):

  • Richard R. Vásquez, Jr. (Original procedural code by Ulf Härnhammar)

Version:

  • PHP4 OOP 0.2.2

Copyright:

  • Richard R. Vásquez, Jr. 2003-2005




Class Details

[line 82]
Kses strips evil scripts!

This class removes unwanted HTML/XHTML tags, attributes from tags, and protocols contained in links. The net result is a much more powerful tool than PHP's built-in strip_tags().

This is a fork of a slick piece of procedural code called 'kses' written by Ulf Härnhammar. The entire set of functions was wrapped in a PHP object with some internal modifications by Richard Vasquez (http://www.chaos.org/) on 7/25/2003.

This upgrade provides the following:

  • Version number synced to procedural version number
  • PHPdoc style documentation has been added to the class. See http://www.phpdoc.org/ for more info.
  • Some methods are now deprecated due to nomenclature style change. See method documentation for specifics.
  • Kses4 now works in E_STRICT
  • Addition of methods AddProtocols(), filterKsesTextHook(), RemoveProtocol(), RemoveProtocols() and SetProtocols()
  • Deprecated _hook(), Protocols()
  • Integrated code from kses 0.2.2 into class.




Tags:

author:  Richard R. Vásquez, Jr. (Original procedural code by Ulf Härnhammar)
version:  PHP4 OOP 0.2.2
copyright:  Richard R. Vásquez, Jr. 2003-2005
link:  http://chaos.org/contact/ Contact page with current email address for Richard Vasquez
link:  http://sourceforge.net/projects/kses/ Home Page for Kses
license:  GNU Public License




Class Variables

$allowed_html = array()

[line 89]



Tags:

access:  private

Type:   array



$allowed_protocols = array()

[line 88]



Tags:

access:  private

Type:   array





Class Methods


constructor kses4 [line 99]

kses4 kses4( )

Constructor for kses.

This sets a default collection of protocols allowed in links, and creates an empty set of allowed HTML tags.




Tags:

since:  PHP4 OOP 0.0.1



method AddHTML [line 332]

bool AddHTML( [string $tag = ""], [array $attribs = array()])

Adds valid (X)HTML with corresponding attributes that will be kept when stripping 'evil scripts'.

This method accepts a tag name and an optional associative array of the attributes allowed on that tag. Invalid data will be ignored.




Tags:

return:  Status of Adding (X)HTML and attributes.
since:  PHP4 OOP 0.0.1
access:  public


Parameters:

string   $tag   (X)HTML tag that will be allowed after stripping text.
array   $attribs   Associative array of allowed attributes - key => attribute name - value => attribute parameter
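
As an illustration of how AddHTML() and Parse() fit together, the following sketch wraps them in a small helper. The helper name filter_comment() is hypothetical and not part of kses4. The attribute value 1 is assumed to mean "allow this attribute without further checks"; the documentation above only says the value is an "attribute parameter".

```php
<?php
// Hypothetical helper, not part of kses4: registers a small whitelist on a
// kses4-style object and returns the filtered text.
function filter_comment($kses, $html)
{
    // Assumption: a value of 1 allows the attribute without extra checks.
    $kses->AddHTML('a', array('href' => 1, 'title' => 1));
    $kses->AddHTML('b');    // tag allowed, no attributes
    $kses->AddHTML('br');   // tag allowed, no attributes

    // Everything not registered above is stripped by Parse().
    return $kses->Parse($html);
}
```

With the class file from the source location above loaded, a call would look like filter_comment(new kses4(), $text), where $text is the untrusted input.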


method AddProtocol [line 213]

bool AddProtocol( [string $protocol = ""])

Adds a single protocol to $this->allowed_protocols.

This method accepts a string argument and adds it to the list of allowed protocols to keep when performing Parse().




Tags:

return:  Status of adding valid protocol.
since:  PHP4 OOP 0.0.1
access:  public


Parameters:

string   $protocol   The name of the protocol to be added.


method AddProtocols [line 150]

bool AddProtocols( mixed 0)

Allows for single/batch addition of protocols

This method accepts one argument that can be either a string or an array of strings. Invalid data will be ignored.

The argument will be processed, and each string will be added via AddProtocol().




Tags:

return:  Status of adding valid protocols.
see:  kses4::AddProtocol()
since:  PHP4 OOP 0.2.1
access:  public


Parameters:

mixed   0   A string or array of protocols that will be added to the internal list of allowed protocols.
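
The "string or array of strings, invalid data ignored" contract can be sketched as a standalone function. This is not the code of AddProtocols(); collect_protocols() is a hypothetical name, and the lowercasing and trailing-colon handling are assumptions about what a valid protocol name looks like.

```php
<?php
// Sketch only: collects usable protocol names from a string or an array.
function collect_protocols($arg)
{
    // Treat a single string as a one-element batch.
    $candidates = is_array($arg) ? $arg : array($arg);
    $protocols  = array();

    foreach ($candidates as $item) {
        if (!is_string($item)) {
            continue;                        // invalid data is ignored
        }
        // Assumption: names are case-insensitive and "http:" means "http".
        $name = rtrim(strtolower(trim($item)), ':');
        if ($name !== '' && !in_array($name, $protocols)) {
            $protocols[] = $name;            // keep each name once
        }
    }
    return $protocols;
}
```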


method DumpElements [line 315]

array DumpElements( )

Raw dump of allowed (X)HTML elements

This returns an indexed array of allowed (X)HTML elements and attributes for a particular KSES instantiation.




Tags:

return:  The list of allowed elements.
since:  PHP4 OOP 0.2.2
access:  public



method DumpProtocols [line 300]

array DumpProtocols( )

Raw dump of allowed protocols

This returns an indexed array of allowed protocols for a particular KSES instantiation.




Tags:

return:  The list of allowed protocols.
since:  PHP4 OOP 0.2.2
access:  public



method filterKsesTextHook [line 573]

string filterKsesTextHook( string $string)

Allows for additional user defined modifications to text.

This method allows for additional modifications to be performed on a string that's being run through Parse(). Currently, it returns the input string 'as is'.

This method is provided for users to extend the kses class for their own requirements.




Tags:

return:  User modified string.
see:  kses4::Parse()
since:  PHP5 OOP 1.0.0
access:  public


Parameters:

string   $string   String to perform additional modifications on.


method Parse [line 122]

string Parse( [string $string = ""])

Basic task of kses - parses $string and strips it as required.

This method strips all the disallowed (X)HTML tags, attributes and protocols from the input $string.




Tags:

return:  The stripped string
since:  PHP4 OOP 0.2.1
access:  public


Parameters:

string   $string   String to be stripped of 'evil scripts'


method Protocols [line 189]

bool Protocols( )

Allows for single/batch addition of protocols



Tags:

see:  kses4::AddProtocols()
deprecated:  Use AddProtocols()
since:  PHP4 OOP 0.0.1



method RemoveProtocol [line 392]

bool RemoveProtocol( [string $protocol = ""])

Removes a single protocol from $this->allowed_protocols.

This method accepts a string argument and removes it from the list of allowed protocols to keep when performing Parse().




Tags:

return:  Status of removing valid protocol.
since:  PHP4 OOP 0.2.1
access:  public


Parameters:

string   $protocol   The name of the protocol to be removed.


method RemoveProtocols [line 438]

bool RemoveProtocols( mixed 0)

Allows for single/batch removal of protocols

This method accepts one argument that can be either a string or an array of strings. Invalid data will be ignored.

The argument will be processed, and each string will be removed via RemoveProtocol().




Tags:

return:  Status of removing valid protocols.
see:  kses4::RemoveProtocol()
since:  PHP5 OOP 0.2.1
access:  public


Parameters:

mixed   0   A string or array of protocols that will be removed from the internal list of allowed protocols.


method SetProtocols [line 257]

bool SetProtocols( mixed 0)

Allows for single/batch replacement of protocols

This method accepts one argument that can be either a string or an array of strings. Invalid data will be ignored.

Existing protocols will be removed, then the argument will be processed, and each string will be added via AddProtocol().




Tags:

return:  Status of replacing valid protocols.
see:  kses4::AddProtocol()
since:  PHP4 OOP 0.2.2
access:  public


Parameters:

mixed   0   A string or array of protocols that will be the new internal list of allowed protocols.


method _array_lc [line 586]

array _array_lc( array $inarray)

This method goes through an array, and changes the keys to all lower case.



Tags:

return:  Modified array
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

array   $inarray   Associative array


method _attr [line 699]

string _attr( string $element, string $attr)

This method strips out disallowed attributes for (X)HTML tags.

This method removes all attributes if none are allowed for this element. If some are allowed it calls $this->_hair() to split them further, and then it builds up new HTML code from the data that $this->_hair() returns. It also removes "<" and ">" characters, if there are any left. One more thing it does is to check if the tag has a closing XHTML slash, and if it does, it puts one in the returned code as well.




Tags:

return:  Resulting valid (X)HTML or ''
see:  kses4::_hair()
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $element   (X)HTML tag to check
string   $attr   Text containing attributes to check for validity.


method _bad_protocol [line 914]

string _bad_protocol( string $string)

This method removes disallowed protocols.

This method removes all non-allowed protocols from the beginning of $string. It ignores whitespace and the case of the letters, and it does understand HTML entities. It does its work in a while loop, so it won't be fooled by a string like "javascript:javascript:alert(57)".




Tags:

return:  String with removed protocols
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string   String to check for protocols
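
The repeat-until-stable idea described above can be sketched as follows. This is a simplified illustration, not the class's implementation: the function names and the $allowed parameter are hypothetical, and HTML entity decoding is left out.

```php
<?php
// Sketch only: strips one leading "scheme:" prefix unless it is whitelisted.
function strip_protocol_once($string, $allowed)
{
    // Optional whitespace, a scheme name, optional whitespace, a colon.
    if (!preg_match('/^\s*([a-z][a-z0-9+.\-]*)\s*:/i', $string, $match)) {
        return $string;                      // no protocol prefix at all
    }
    if (in_array(strtolower($match[1]), $allowed)) {
        return $string;                      // whitelisted, leave untouched
    }
    return substr($string, strlen($match[0]));   // drop the bad prefix
}

// Repeats the single pass until the string stops changing, so stacked
// prefixes such as "javascript:javascript:" are removed completely.
function strip_bad_protocols($string, $allowed)
{
    do {
        $before = $string;
        $string = strip_protocol_once($string, $allowed);
    } while ($string !== $before);

    return $string;
}
```

A single pass would turn "javascript:javascript:alert(57)" into "javascript:alert(57)", which is still dangerous; the loop is what closes that gap.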


method _bad_protocol_once [line 942]

string _bad_protocol_once( string $string)

Helper method used by _bad_protocol()

This function searches for URL protocols at the beginning of $string, while handling whitespace and HTML entities. Function updated to fix security vulnerability (see http://projects.dokeos.com/index.php?do=details&task_id=2312)




Tags:

return:  String with removed protocols
see:  kses4::_bad_protocol()
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string   String to check for protocols


method _bad_protocol_once2 [line 964]

string _bad_protocol_once2( string $string)

Helper method used by _bad_protocol_once() regex

This function processes URL protocols, checks whether they are in the whitelist, and returns different data depending on the answer.




Tags:

return:  String with removed protocols
see:  kses4::_bad_protocol_once()
see:  kses4::_bad_protocol()
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string   String to check for protocols


method _check_attr_val [line 1009]

bool _check_attr_val( string $value, string $vless, string $checkname, string $checkvalue)

This function performs different checks for attribute values.

The currently implemented checks are "maxlen", "minlen", "maxval", "minval" and "valueless" with even more checks to come soon.




Tags:

return:  Indicates whether the check passed or not
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $value   The value of the attribute to be checked.
string   $vless   Indicates whether the value is supposed to be valueless
string   $checkname   The check to be performed
string   $checkvalue   The value that is to be checked against
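
A minimal sketch of the five checks named above, assuming the parameter meanings given in the list. The function name check_attr_value() is hypothetical, and the details (for example, that "maxval" and "minval" only accept non-negative integers, and that $vless is 'y' or 'n') are assumptions rather than confirmed behaviour of the class.

```php
<?php
// Sketch only: returns true when $value passes the named check.
function check_attr_value($value, $vless, $checkname, $checkvalue)
{
    switch (strtolower($checkname)) {
        case 'maxlen':      // value may not be longer than $checkvalue
            return strlen($value) <= $checkvalue;
        case 'minlen':      // value may not be shorter than $checkvalue
            return strlen($value) >= $checkvalue;
        case 'maxval':      // assumption: only non-negative integers pass
            return preg_match('/^\s*[0-9]+\s*$/', $value)
                && (int) trim($value) <= $checkvalue;
        case 'minval':
            return preg_match('/^\s*[0-9]+\s*$/', $value)
                && (int) trim($value) >= $checkvalue;
        case 'valueless':   // assumption: both flags are 'y' or 'n'
            return strtolower($checkvalue) === strtolower($vless);
    }
    return true;            // unknown checks pass in this sketch
}
```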


method _decode_entities [line 1136]

string _decode_entities( string $string)

Decodes numeric HTML entities

This method decodes numeric HTML entities (&#65; and &#x41;). It doesn't do anything with other entities like &auml;, but we don't need them in the URL protocol white listing system anyway.




Tags:

return:  Decoded entity
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string   The entity to be decoded.
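
Decoding of the two numeric forms can be sketched like this. The function name is hypothetical, closures require PHP 5.3 or later, and the sketch only targets single-byte results, which is enough for ASCII protocol names such as "javascript:".

```php
<?php
// Sketch only: decodes &#65; and &#x41; style entities to single bytes.
function decode_numeric_entities($string)
{
    // Decimal form, e.g. &#65;
    $string = preg_replace_callback('/&#([0-9]{1,5});/', function ($m) {
        return chr((int) $m[1]);
    }, $string);

    // Hexadecimal form, e.g. &#x41; or &#X41;
    $string = preg_replace_callback('/&#[Xx]([0-9A-Fa-f]{1,4});/', function ($m) {
        return chr(hexdec($m[1]));
    }, $string);

    // Named entities such as &auml; are deliberately left alone.
    return $string;
}
```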


method _hair [line 789]

array _hair( string $attr)

This method combs through an attribute list string and returns an associative array of attributes and values.

This method does a lot of work. It parses an attribute list into an array with attribute data, and tries to do the right thing even if it gets weird input. It will add quotes around attribute values that don't have any quotes or apostrophes around them, to make it easier to produce HTML code that will conform to W3C's HTML specification. It will also remove bad URL protocols from attribute values.




Tags:

return:  Associative array containing data on attribute and value
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $attr   Text containing tag attributes for parsing


method _hook [line 552]

string _hook( string $string)

Allows for additional user defined modifications to text.



Tags:

see:  kses4::filterKsesTextHook()
deprecated:  use filterKsesTextHook()
since:  PHP4 OOP 0.0.1


Parameters:

string   $string  


method _html_error [line 1119]

string _html_error( string $string)

Helper method for _hair()

This function deals with parsing errors in _hair(). The general plan is to remove everything up to and including some whitespace, but it deals with quotes and apostrophes as well.




Tags:

return:  string stripped of whitespace
see:  kses4::_hair()
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string   The string to be stripped.


method _js_entities [line 491]

string _js_entities( string $string)

This function removes the HTML JavaScript entities found in early versions of Netscape 4.



Tags:

return:  String without JavaScript entities
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string  
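
Netscape 4 JavaScript entities have the form &{expression}; inside attribute values. A minimal sketch of removing them with one regular expression follows; the function name is hypothetical and the pattern is an illustration, not necessarily the one the class uses.

```php
<?php
// Sketch only: removes &{...}; sequences, including unterminated ones.
function strip_js_entities($string)
{
    // "&", optional whitespace, "{", anything up to "}", then either
    // "}" with an optional ";" or the end of the string.
    return preg_replace('%&\s*\{[^}]*(\}\s*;?|$)%', '', $string);
}
```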


method _normalize_entities [line 507]

string _normalize_entities( string $string)

Normalizes HTML entities

This function normalizes HTML entities. It will convert "AT&T" to the correct "AT&amp;T", "&#00058;" to "&#58;", "&#XYZZY;" to "&amp;#XYZZY;" and so on.




Tags:

return:  String with normalized entities
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string  
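
The three conversions quoted above can be reproduced with an escape-then-restore approach: every ampersand is escaped first, and only well-formed entities are turned back. This is a sketch under that assumption, with a hypothetical function name; it uses preg_replace_callback() (PHP 5.3+ closures) for the 16-bit limit on decimal entities.

```php
<?php
// Sketch only: escape every "&", then restore well-formed entities.
function normalize_entities($string)
{
    // 1. "AT&T" becomes "AT&amp;T".
    $string = str_replace('&', '&amp;', $string);

    // 2. Restore named entities such as &eacute;.
    $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]{0,19});/', '&$1;', $string);

    // 3. Restore decimal entities up to 65535, dropping leading zeros.
    $string = preg_replace_callback('/&amp;#0*([0-9]{1,5});/', function ($m) {
        return ((int) $m[1] > 65535) ? "&amp;#{$m[1]};" : "&#{$m[1]};";
    }, $string);

    // 4. Restore hexadecimal entities of two or four hex digits.
    $string = preg_replace('/&amp;#([Xx])0*(([0-9A-Fa-f]{2}){1,2});/', '&#$1$2;', $string);

    return $string;
}
```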


method _normalize_entities2 [line 538]

string _normalize_entities2( string $i)

Helper method used by _normalize_entities()

This method helps _normalize_entities() accept only 16-bit values for &#number; entities.

It is called during a preg_replace() in _normalize_entities() wherever a &#(0)*XXXXX; occurs. The '(0)*XXXXX' value is converted to a number, and the result is returned as a numeric entity if the number is less than 65536. Otherwise, the value is returned 'as is'.




Tags:

return:  Normalized numeric entity
see:  kses4::_normalize_entities()
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $i  


method _no_null [line 475]

string _no_null( string $string)

This method removes any NULL or chr(173) characters in $string.



Tags:

return:  String without any NULL/chr(173)
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string  
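
Taking the return description literally, the removal can be sketched with two str_replace() calls. The function name is hypothetical. Note that this works on bytes: in UTF-8 input the byte 173 (0xAD) also occurs inside multi-byte characters, so this sketch is only safe for single-byte encodings.

```php
<?php
// Sketch only: removes NUL bytes and chr(173) (soft hyphen in ISO-8859-1).
function remove_null_chars($string)
{
    $string = str_replace("\0", '', $string);      // NUL bytes
    $string = str_replace(chr(173), '', $string);  // byte 0xAD
    return $string;
}
```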


method _split [line 620]

string _split( string $string)

This method searches for HTML tags, no matter how malformed. It also matches stray ">" characters.



Tags:

return:  HTML tags
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string  


method _split2 [line 644]

string _split2( string $string)

This method strips out disallowed and/or mangled (X)HTML tags along with assigned attributes.

This method does a lot of work. It rejects some very malformed things like <:::>. It returns an empty string if the element isn't allowed (look ma, no strip_tags()!). Otherwise it splits the tag into an element and an allowed attribute list.




Tags:

return:  Modified string minus disallowed/mangled (X)HTML and attributes
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string  


method _stripslashes [line 1101]

string _stripslashes( string $string)

Changes \" to "

This function changes the character sequence \" to just ". It leaves all other slashes alone. It's really weird, but the quoting from preg_replace(//e) seems to require this.




Tags:

return:  string stripped of \"
since:  PHP4 OOP 0.0.1
access:  private


Parameters:

string   $string   The string to be stripped.
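
The replacement itself is small enough to sketch with str_replace(); the function name is hypothetical. As background, the /e modifier of preg_replace() mentioned above was deprecated in PHP 5.5 and removed in PHP 7.0, so code written today would normally use preg_replace_callback(), which does not add these slashes in the first place.

```php
<?php
// Sketch only: turns \" into " and leaves every other backslash alone.
function strip_quote_slashes($string)
{
    // '\\"' in single quotes is the two-character sequence: backslash, quote.
    return str_replace('\\"', '"', $string);
}
```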


method _version [line 1153]

string _version( )

Returns PHP4 OOP version # of kses.

Since this class has been refactored and documented and proven to work, I'm syncing the version number to procedural kses.




Tags:

return:  Version number
since:  PHP4 OOP 0.0.1
access:  public




Documentation generated on Thu, 12 Jun 2008 14:12:35 -0500 by phpDocumentor 1.4.1