Serving web pages
Stopping spam and hackers
htaccess
An .htaccess file in a directory on a server tells the server how to handle specified requests or requests from specified internet addresses.
Examples of Apache directives
<Files .htaccess>
order allow,deny
deny from all
</Files>
Prevents anyone from reading the .htaccess file(s).
order allow,deny
deny from 59.92.4.185
deny from 66.225.201
deny from 69.80.96.0/20
allow from all
Blocks requests from 59.92.4.185, from addresses that start with 66.225.201, and from addresses whose first 20 bits match 69.80.96.0 (that is, 69.80.96.0 through 69.80.111.255). Everyone else is allowed.
ErrorDocument 403 "Sorry, requests from your IP address are forbidden"
Shows the quoted phrase when access is denied to a page. Apache 2.x needs the closing quote; Apache 1.3 marked the message with an opening quote only.
ErrorDocument 403 /errors/403.html
Replies with the 403.html file when access is denied to a page.
ErrorDocument 404 /errors/404.html
Replies with the 404.html file when the requested page can't be found.
order allow,deny
allow from all
Put this in an .htaccess file in the folder that holds the 403 file, so that even banned IPs can fetch the 403 error document. It also opens up every other file in that folder, so it is better to wrap it in a Files directive that names only the error document.
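As a rough sketch of that last suggestion, the allow rule can be limited to the error page alone. I am assuming here that the folder is /errors/ and the file is 403.html, as in the ErrorDocument lines above; adjust the name if yours differs.

```apache
# .htaccess inside the /errors/ folder (assumed location)
# Only 403.html is opened to everyone; other files keep the inherited rules.
<Files 403.html>
order allow,deny
allow from all
</Files>
```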
Getting funny characters right
Now that I'm starting, like many, to use UTF-8 encoded data, I need to make sure that PHP connects to MySQL using a UTF-8 pipe. That can be accomplished with a single command in PHP scripts:
mysql_set_charset("utf8");
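The mysql_* functions were deprecated in PHP 5.5 and removed in PHP 7.0, so on a newer server the same idea would probably go through mysqli instead. Here is a minimal sketch under that assumption. The function name connectUtf8 and its four parameters are placeholders of mine, not anything from my real scripts. I use "utf8mb4" because MySQL's plain "utf8" only stores characters of up to three bytes.

```php
<?php
// Sketch only: connectUtf8 is a hypothetical helper, and the host, user,
// password, and database values are supplied by the caller.
function connectUtf8(string $host, string $user, string $password, string $database): mysqli
{
    $db = new mysqli($host, $user, $password, $database);

    // Ask MySQL to use UTF-8 for everything sent over this connection.
    if (!$db->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }

    return $db;
}
```

Calling it once at the top of a script, before any queries, should be enough for that connection.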
Processing form input
Putting tags around links automatically
I have forms where people write stories that usually include web links. If I want the links to be clickable, then I usually write <a> tags around the link, repeating the link in the href field. Now I have a string replace function in PHP that will do that automatically:
$text = preg_replace('~(?<![>"])(https?://\S+)~', '<a href="$1">$1</a>', $text); // $text being the input text
- Explaining the function
- preg_replace searches for the substrings of the third argument ($text here) that match the first argument, replacing them with the second argument.
- ~...~ The search string starts and ends with tildes, which just delimit the beginning and end of the regular expression. The delimiter can be almost any symbol, preferably one we're not using in the search string. Slashes and quotes are often used, but we have those two symbols in our search string itself, so I chose tildes.
- (?<![>"]) is a negative lookbehind that means "not preceded by a greater-than sign or a double quote." It makes sure we skip links that already appear right after quotation marks or greater-than symbols, since that would mean the link is already part of a tag. A plain [^>"] in this spot would also match the character before the link and replace it along with the link, and it would miss a link at the very start of the text; the lookbehind checks the preceding character without consuming it.
- (...) The parentheses mean this section of the found string will be needed later, so it is saved in a variable, $1.
- http must begin the string we're looking for.
- s? The question mark makes the s optional, allowing zero or one "s", since https: is just as valid as http:
- :// A colon and two forward slashes must follow.
- \S+ Any number of non-whitespace characters can then follow. "\S" stands for one non-whitespace character, and the plus sign means one or more of whatever precedes it.
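To see the pieces working together, here is a small sketch that wraps the replacement in a function and runs it on two sample strings. The name linkify is just my placeholder. The sketch assumes the "not right after > or double quote" check is written as a negative lookbehind, (?<![>"]), so that the character in front of the link stays in the text.

```php
<?php
// linkify is a hypothetical helper name used only for this illustration.
function linkify(string $text): string
{
    // (?<![>"])      skip links that directly follow > or " (already in a tag)
    // (https?://\S+) capture the link itself as $1
    return preg_replace('~(?<![>"])(https?://\S+)~', '<a href="$1">$1</a>', $text);
}

echo linkify('Visit http://energyteachers.org today'), "\n";
// Visit <a href="http://energyteachers.org">http://energyteachers.org</a> today

echo linkify('<a href="http://energyteachers.org">home</a>'), "\n";
// <a href="http://energyteachers.org">home</a>   (left unchanged)
```

The second call shows why the check matters: a link that is already inside an href is left alone instead of being wrapped a second time.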
- Problems with this function
- A period being a non-whitespace character, if the writer puts a period after the link, as in "http://energyteachers.org." we will catch that period in our search, and the link will fail. We should consider making the last character either a letter, a slash, a number, a question mark, or an ampersand, because links could be like any of the following: