#42935 - 07/01/07 12:04 AM Utilizing SE Friendly URLs in PHP
Gremelin Offline

Disclaimer
Please note that all code contained in this thread was created, and belongs to, VNC Web Design and Development and may be freely used for non-commercial purposes with the agreement that credit will be visibly identifiable within your application and within code. Commercial applications must obtain permission before utilizing code.

This article may be reproduced and republished with prior written permission from myself. This code is provided here as an example, but should fully work "out of the box".

Onward!
One thing I've always advocated is web standardization; a loose part of this is SE (search engine) friendly URLs, which do away with certain characters in page URLs that search engines (or poorly coded web browsers) dislike.

These characters can be, but are not limited to:
&, ?, =

There are several ways to go about this, and I'll introduce two that I've used; these aren't the "only" ways to do it, but they are rather simple and efficient.

Option A, mod_rewrite
Not necessarily my favorite method, but it works and it works well; in fact, our IRC information page here on UGN Security utilizes this.

Please note that there are MANY ways to do this via mod_rewrite, and I'm sure there are more efficient ways than the one I use below, but this is a good starting point for allowing SE friendly URLs via mod_rewrite:

Example .htaccess entry:
 Code:
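# Tell mod_rewrite we're wanting to utilize it
RewriteEngine on

# SE Friendly URLs
RewriteRule ^irc/([^/]+)/([^/]+)\.php$ /irc.php?section=$1&channel=$2 [L]


This will allow SE Friendly URLs on a script named irc.php, with "section" set to $1 (the first ([^/]+) group) and "channel" set to $2 (the second ([^/]+) group). Each group matches everything up to the next slash, and the [L] flag stops further rules from being processed once this one matches.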

So, accessing the page as:
http://www.undergroundnews.com/irc/chat/staff.php

You'll see that the section is "chat" and the channel is #staff. The .php on the end is just a virtual extension; it doesn't map to a real file, but the rule above expects it to be there.
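On the PHP side, irc.php then only has to read the two values the rewrite hands it. A minimal sketch, assuming a hypothetical helper named irc_params and assuming that section and channel names only ever contain letters, digits, dashes, and underscores (neither is taken from the real irc.php):

```php
<?php
// Hypothetical helper: pull the two rewritten parameters out of a query array.
function irc_params(array $query) {
	// Assumption: names only use letters, digits, dashes and underscores;
	// every other character is dropped.
	$clean = function ($value) {
		return preg_replace('/[^A-Za-z0-9_-]/', '', (string)$value);
	};

	$section = isset($query['section']) ? $clean($query['section']) : '';
	$channel = isset($query['channel']) ? $clean($query['channel']) : '';

	return array('section' => $section, 'channel' => '#' . $channel);
}

// In irc.php this would be called as irc_params($_GET).
$params = irc_params(array('section' => 'chat', 'channel' => 'staff'));
echo $params['section'] . ' ' . $params['channel'] . "\n"; // chat #staff
```

Taking the query array as an argument, rather than reading $_GET inside the function, keeps the helper easy to test without a web server.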

Option B, path_info
Another way of doing this is via the PATH_INFO variable; I like this method more as it lets everything be handled in PHP, is a lot easier to adjust, and is far more powerful.

For users of Apache 2, you'll at times need to "turn on" path info in your .htaccess file:
 Code:
AcceptPathInfo On


The code I tend to go with for utilizing Path Info in PHP is:
 Code:
// Path Info Translation
// ------------------------------
// Take the path from the URL.
	$path = isset($_SERVER["PATH_INFO"]) ? strip_tags(addslashes(htmlspecialchars($_SERVER["PATH_INFO"]))) : "";

// Build an array from the path.
	$translation = preg_split("/[\/]+/", $path);
	unset($translation['0']);


What this does is read the path after your script name (in this case articles.php) and split it into an array. Once the array is built, the first element is unset, as it will always be empty (the path starts with a slash, so there is nothing before the first one).
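To make the split concrete, here is a small sketch that runs the same kind of preg_split on a literal path instead of $_SERVER["PATH_INFO"]. The function name split_path is hypothetical, and the sanitizing calls are left out so only the splitting step is shown:

```php
<?php
// Hypothetical helper: split a PATH_INFO style string into 1-indexed segments.
function split_path($path) {
	// One or more slashes count as a single separator.
	$parts = preg_split('#/+#', $path);
	// Element 0 holds whatever came before the first slash, which is empty
	// for a path that starts with "/".
	unset($parts[0]);
	return $parts;
}

$translation = split_path("/category/21/page/1");
// $translation is now array(1 => "category", 2 => "21", 3 => "page", 4 => "1")
```

Because element 0 is removed rather than shifted off, the remaining keys start at 1.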

Now, translating these into our script is done via:
 Code:
// Split the array into useable chunks.
	if($translation["1"] == "category") { $category = (int)$translation["2"]; }
	elseif($translation["1"] == "task") { $task = strip_tags(addslashes(htmlspecialchars($translation["2"]))); }
	elseif($translation["1"] == "article") { $article = (int)$translation["2"]; }
	if($translation["3"] == "page") { $page = (int)$translation["4"]; }


Which basically reads: if element 1 of the array is one of the three possible keys (category, task, or article), the value in element 2 is passed to the script. If element 3 is "page", the value in element 4 is passed to the script as the page number.

This looks like one of the following:
articles.php/category/21/page/1
articles.php/category/21
articles.php/article/54
articles.php/task/rss
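The four URL shapes above can be checked against a more defensive version of the same mapping. This is only a sketch: parse_route is a hypothetical name, it assumes the path starts with a slash the way PATH_INFO does, and it assumes task names are plain lowercase words such as "rss".

```php
<?php
// Hypothetical helper: map path segments to the variables the script uses.
function parse_route($path) {
	$seg = preg_split('#/+#', $path);
	unset($seg[0]); // empty, because the path is assumed to start with "/"

	$route = array('category' => null, 'task' => null, 'article' => null, 'page' => null);

	// isset() guards avoid undefined index notices on short paths.
	$key   = isset($seg[1]) ? $seg[1] : '';
	$value = isset($seg[2]) ? $seg[2] : '';

	if ($key === 'category' || $key === 'article') {
		$route[$key] = (int)$value;
	} elseif ($key === 'task') {
		// Assumption: task names only contain a-z, 0-9 and underscores.
		$route['task'] = preg_replace('/[^a-z0-9_]/', '', strtolower($value));
	}

	if (isset($seg[3], $seg[4]) && $seg[3] === 'page') {
		$route['page'] = (int)$seg[4];
	}

	return $route;
}

$route = parse_route("/category/21/page/1");
// $route['category'] is 21 and $route['page'] is 1; task and article stay null
```

Returning one array instead of setting loose variables makes it obvious which values were actually present in the URL.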

You could also pass a virtual extension (.html, .php, etc) if you'd like to do so, however you'd want to make sure the script knows to filter it out so it's not passed to the parser.
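One way to filter out the virtual extension is to remove it only when it sits at the very end of the path, so a ".html" in the middle of a segment is left alone. A minimal sketch, assuming a hypothetical helper named strip_virtual_extension and an example extension list:

```php
<?php
// Hypothetical helper: drop one trailing virtual extension from a path.
function strip_virtual_extension($path, array $extensions = array('.html', '.php', '.rss')) {
	foreach ($extensions as $ext) {
		$len = strlen($ext);
		// Only match the extension at the end of the path.
		if ($len > 0 && substr($path, -$len) === $ext) {
			return substr($path, 0, -$len);
		}
	}
	return $path;
}

$path = strip_virtual_extension("/article/54.html");
// $path is now "/article/54"
```

Calling this before the path is split means the parser never sees the extension.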
_________________________
Donate to UGN Security here.
UGN Security, Back of the Web, Elite Web Gamers & VNC Web Design Owner

#42936 - 07/01/07 12:11 AM Re: Utilizing SE Friendly URLs in PHP [Re: Gremelin]
Gremelin Offline

 Originally Posted By: Gizmo
 Code:
// Split the array into useable chunks.
	if($translation["1"] == "category") { $category = (int)$translation["2"]; }
	elseif($translation["1"] == "task") { $task = strip_tags(addslashes(htmlspecialchars($translation["2"]))); }
	elseif($translation["1"] == "article") { $article = (int)$translation["2"]; }
	if($translation["3"] == "page") { $page = (int)$translation["4"]; }


Figured I should explain the "security" I have in place here and what the different items mean...

(int) casts the value to an integer, so non-numeric text becomes 0
strip_tags strips any HTML and PHP tags
addslashes adds a \ before single quotes, double quotes, backslashes, and NUL bytes
htmlspecialchars converts HTML special characters (&, <, >, and quotes) to their HTML entities.
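A quick way to see what each of those does is to run them one at a time on small literal inputs. The values below are made up for illustration:

```php
<?php
// Each call shows one of the four filters acting on its own.
$number  = (int)"21abc";                        // 21, the leading digits are kept
$nothing = (int)"abc";                          // 0, no leading digits at all
$plain   = strip_tags("<b>rss</b>");            // rss
$slashed = addslashes("it's");                  // it\'s
$encoded = htmlspecialchars("<a href=\"x\">");  // &lt;a href=&quot;x&quot;&gt;
```

Note that (int) never rejects input; it turns anything unusable into 0, so 0 should not be a meaningful ID in the script.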
#42938 - 07/01/07 02:57 AM Re: Utilizing SE Friendly URLs in PHP [Re: Gremelin]
Gremelin Offline

For anyone wondering what my actual code block looks like (including the virtual extension):
 Code:
// Path Info Translation
// ------------------------------
// Lets set some variables
	$fake_html_extension = ".html";
	$fake_rss_extension = ".rss";

// Take the path from the URL.
	$path = isset($_SERVER["PATH_INFO"]) ? strip_tags(addslashes(htmlspecialchars($_SERVER["PATH_INFO"]))) : "";

// Lets weed out any "baddies"
//	$replaces = array(".xml", ".rss", ".php", ".html", ".htm", ".shtml");
	$replaces = array($fake_html_extension, $fake_rss_extension);
	$path = str_replace($replaces, "", $path);

// Build an array from the path.
	$translation = preg_split("/[\/]+/", $path);
	unset($translation['0']);

// Split the array into useable chunks.
	if($translation["1"] == "category") { $category = (int)$translation["2"]; }
	elseif($translation["1"] == "task") { $task = strip_tags(addslashes(htmlspecialchars($translation["2"]))); }
	elseif($translation["1"] == "article") { $article = (int)$translation["2"]; }
	if($translation["3"] == "page") { $page = (int)$translation["4"]; }
// ------------------------------
// End Path Info Translation


As long as element 3 isn't "page", you can use that slot for a "virtual extension" or even an SEO-friendly title, as anything placed there is ignored by the parser. So by running a title through my seo_titles function you can push URLs such as:
articles.php/category/8/text.html
articles.php/article/53/text.html
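A rough sketch of how a URL like that could be put together. Both simple_slug and article_url are hypothetical stand-ins written for this illustration; they are deliberately simpler than the functions used on the real site.

```php
<?php
// Hypothetical stand-in for a slug function: lowercase the title, then turn
// every run of characters other than a-z and 0-9 into a single dash.
function simple_slug($title) {
	$slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($title));
	return trim($slug, '-');
}

// Hypothetical helper: build an article URL with the slug as a virtual file name.
function article_url($id, $title) {
	return "articles.php/article/" . (int)$id . "/" . simple_slug($title) . ".html";
}

echo article_url(53, "Utilizing SE Friendly URLs in PHP!") . "\n";
// articles.php/article/53/utilizing-se-friendly-urls-in-php.html
```

The parser only ever reads the numeric ID, so the slug can change (for example when a title is edited) without breaking old links.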

My seo_titles function, and my make_sane function (for ensuring no "invalid" data is passed to the URLs), are as follows:

 Code:
function shorten_length($str, $start, $end) {
	if(strlen($str) > $end) { $str = substr($str, $start, $end); }
	return($str);
}

function make_sane($str) {
// Encode once; the trailing "false" keeps existing entities from being encoded twice
	$str = htmlentities($str, ENT_QUOTES, "UTF-8", false);

// Swap curly quote entities for their plain equivalents
	$patterns = array("&rsquo;", "&ldquo;", "&rdquo;");
	$replaces = array("'", "&quot;", "&quot;");
	$str = str_replace($patterns, $replaces, $str);

	return($str);
}

function seo_titles($str, $type, $shorten) {
// Lets eliminate bad content
	$str = make_sane($str);

// Strip entities first, then any leftover punctuation
	$patterns = array("&quot;", "&#039;", "&lt;", "&gt;", "&amp;", "\"", "\\", "|", "[", "{", "]", "}", "?", "!", "@", "#", "$", "%", "^", "&", "*", "(", ")", "+", "=", ";", ":", ",", ".", "'");
	$str = strtolower(str_replace($patterns, "", $str));

// Shorten after stripping so an entity is never cut in half
	if($shorten == 1) { $str = shorten_length($str, 0, 50); }

	$patterns = array(" ", "%20");
	if($type == 1) { $replaces = "-"; }
	else { $replaces = "_"; }

	$str = str_replace($patterns, $replaces, $str);
	return($str);
}