I was surprised that I could not find a script online to download all of an author’s stories from FictionPress or FanFiction.Net, so I threw together the one below.
If you go to an author’s page in a browser (only tested in Chrome) it should list all of their stories, and you can run the following script in the console (F12) to grab them all. The files are saved with the name format STORY_NAME_LINK_FORMAT - CHAPTER_NUMBER.html. It works as follows:
Gathers all of the names, chapter 1 links, and chapter counts for each story.
Converts this information into a list of links it needs to download. The links are formed by using the chapter 1 link, and just replacing the chapter number.
It then downloads all of the links to your current browser’s download folder.
Do note that Chrome should prompt you with “This site is attempting to download multiple files”, so of course, say yes. The script is also designed to detect problems, which would happen if FictionPress changes its HTML formatting.
//Gather the story information
const Stories=[];
$('.mystories .stitle').each((Index, El) =>
	Stories[Index]={Link:$(El).attr('href'), Name:$(El).text()}
);
$('.mystories .xgray').each((Index, El) =>
	Stories[Index].NumChapters=(/ - Chapters: (\d+) - /.exec($(El).text()) || [])[1] //Left undefined if not found, which is reported below
);

//Get links to all stories
const LinkStart=document.location.protocol+'//'+document.location.host;
const AllLinks=[];
$.each(Stories, (_, Story) => {
	if(typeof(Story.NumChapters)!=='string' || !/^\d+$/.test(Story.NumChapters))
		return console.log('Bad number of chapters for: '+Story.Name);
	const StoryParts=/^\/s\/(\d+)\/1\/(.*)$/.exec(Story.Link);
	if(!StoryParts)
		return console.log('Bad link format for stories: '+Story.Name);
	for(let i=1; i<=Story.NumChapters; i++)
		AllLinks.push([LinkStart+'/s/'+StoryParts[1]+'/'+i+'/'+StoryParts[2], StoryParts[2]+' - '+i+'.html']);
});

//Download all the links
$.each(AllLinks, (_, LinkInfo) =>
	$('a').attr('download', LinkInfo[1]).attr('href', LinkInfo[0])[0].click()
);
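As a rough sketch, the link-expansion step (step 2) can also be written on its own as a pure function, which makes it easy to try outside the browser. The name `buildChapterLinks` and the `{name, link, numChapters}` object shape are assumptions made for this illustration only; the script above does the same work inline.

```javascript
// Hypothetical helper: expands one story's chapter-1 link into a link for every chapter.
// `origin` is something like "https://www.fictionpress.com" (no trailing slash).
function buildChapterLinks(origin, story) {
  // Chapter-1 links are expected to look like /s/STORY_ID/1/STORY_NAME
  const parts = /^\/s\/(\d+)\/1\/(.*)$/.exec(story.link);
  if (!parts) throw new Error('Bad link format for story: ' + story.name);

  const count = Number(story.numChapters);
  if (!Number.isInteger(count) || count < 1)
    throw new Error('Bad number of chapters for: ' + story.name);

  // Swap the chapter number into the chapter-1 link, and build the save name
  const links = [];
  for (let i = 1; i <= count; i++) {
    links.push({
      url: origin + '/s/' + parts[1] + '/' + i + '/' + parts[2],
      fileName: parts[2] + ' - ' + i + '.html',
    });
  }
  return links;
}

// Example with made-up story data
console.log(buildChapterLinks('https://www.fictionpress.com',
  { name: 'My Story', link: '/s/123/1/My-Story', numChapters: '2' }));
```

Throwing instead of logging is a design choice for the sketch; in the console script, logging and skipping the story is friendlier because one bad entry should not stop the rest.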
jQuery('.blurb.group .heading a[href^="/works"]').map((_, El) => jQuery(El).text()).toArray().join('\n');
When I first created my website 10 years ago, from scratch, I did not want to deal with writing a comment system with HTML markups. And in those days, there weren’t plugins for everything like there are today. My solution was setting up a forum which would contain a topic for every Project, Update, and Post, and have my pages mirror the linked topic’s posts.
I had just put in a quick hack at the time in which the pulled SMF message’s body had links converted from bbcode (there might have been 1 other bbcode I also hooked). I had done this with regular expressions, which was a nasty hack.
So anywho, I finally got around to writing a script that converts SMF messages’ bbcode to HTML and caches it. You can download it here, or see the code below. The script is optimized so that it only ever needs to load SMF code when a post has not yet been cached. Caching happens during the initial loading of an SMF post within the script’s main function, and is discarded if the post is changed.
The script requires that you first run the query found on line #3 of the script against your SMF database. Directly after that are 3 variables you need to set. The script assumes you are already logged in to the appropriate user. To use it, call “GFTP\GetForumTopicPosts($ForumTopicID)”. I have the functions split up so you can do individual posts too if needed (requires a little extra code).
<?
//This SQL command must be run before using the script
//ALTER TABLE smf_messages ADD body_html text, ADD body_md5 char(32) DEFAULT NULL;

namespace GFTP;

//Forum database variables
global $ForumInfo;
$ForumInfo=Array(
	'DBName'=>'YourDatabase_smf',
	'Location'=>'/home/YourUser/www',
	'MessageTableName'=>'smf2_messages',
);

function GetForumTopicPosts($ForumTopicID)
{
	//Change to the forum database
	global $ForumInfo;
	$CurDB=mysql_fetch_row(mysql_query('SELECT database()'))[0];
	if($CurDB!=$ForumInfo['DBName'])
		mysql_select_db($ForumInfo['DBName']);
	$OldEncoding=SetEncoding(true);

	//Get the posts
	$PostsInfos=Array();
	$PostsQuery=mysql_query('SELECT '.implode(', ', PostFields())." FROM $ForumInfo[MessageTableName] WHERE id_topic='".intval($ForumTopicID)."' AND approved=1 ORDER BY id_msg ASC LIMIT 1, 9999999");
	if($PostsQuery) //If query failed, do not process
		while(($PostInfo=mysql_fetch_assoc($PostsQuery)) && ($PostsInfos[]=$PostInfo))
			if(md5($PostInfo['body'])!=$PostInfo['body_md5']) //If the body md5s do not match, get new value, otherwise, use cached value
				ProcessPost($PostsInfos[count($PostsInfos)-1]); //Process the latest post as a reference

	//Restore from the forum database
	if($CurDB!=$ForumInfo['DBName'])
		mysql_select_db($CurDB);
	SetEncoding(false, $OldEncoding);

	//Return the posts
	return $PostsInfos;
}

function ProcessPost(&$PostInfo) //PostInfo must have fields id_msg, body, body_md5, and body_html
{
	//Load SMF
	global $ForumInfo;
	if(!defined('SMF'))
	{
		global $context;
		require_once(rtrim($ForumInfo['Location'], DIRECTORY_SEPARATOR).DIRECTORY_SEPARATOR.'SSI.php');
		mysql_select_db($ForumInfo['DBName']);
		SetEncoding();
	}

	//Update the cached body_html field
	$ParsedCode=$PostInfo['body_html']=parse_bbc($PostInfo['body']);
	$EscapedHTMLBody=mysql_escape_string($ParsedCode);
	$BodyMD5=md5($PostInfo['body']);
	mysql_query("UPDATE $ForumInfo[MessageTableName] SET body_html='$EscapedHTMLBody', body_md5='$BodyMD5' WHERE id_msg=$PostInfo[id_msg]");
}

//The fields to select in the Post query
function PostFields() { return Array('id_msg', 'poster_time', 'id_member', 'subject', 'poster_name', 'body', 'body_md5', 'body_html'); }

//Swap character encodings. Needs to be set to utf8
function SetEncoding($GetOld=false, $NewSet=Array('utf8', 'utf8', 'utf8'))
{
	//Get the old charset if required
	$CharacterVariables=Array('character_set_client', 'character_set_results', 'character_set_connection');
	$OldSet=Array();
	if($GetOld)
	{
		//Fill in variables with default in case they are not found
		foreach($CharacterVariables as $Index=>$Variable)
			$OldSet[$Variable]='utf8';

		//Query for the character sets and update the OldSet array
		$Query=mysql_query('SHOW VARIABLES LIKE "character_%"');
		while($VariableInfo=mysql_fetch_assoc($Query))
			if(isset($OldSet[$VariableInfo['Variable_name']]))
				$OldSet[$VariableInfo['Variable_name']]=$VariableInfo['Value'];
		$OldSet=array_values($OldSet); //Turn back into numerical array
	}

	//Change to the new database encoding
	$CompiledSets=Array();
	foreach($CharacterVariables as $Index=>$Variable)
		$CompiledSets[$Index]=$CharacterVariables[$Index].'="'.mysql_escape_string($NewSet[$Index]).'"';
	mysql_query('SET '.implode(', ', $CompiledSets));

	//If requested, return the previous values
	return $OldSet;
}
?>
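The caching rule the script relies on (re-render only when the stored MD5 no longer matches the post body) is easy to lose in the database plumbing, so here is a minimal sketch of just that rule in JavaScript for Node. The post shape `{body, bodyMd5, bodyHtml}`, the function name `getCachedHtml`, and the `render` callback (standing in for SMF’s `parse_bbc`) are all assumptions for illustration, not part of the PHP script.

```javascript
const { createHash } = require('node:crypto');

// MD5 of the raw (bbcode) body as a hex string, like PHP's md5()
function md5(text) {
  return createHash('md5').update(text, 'utf8').digest('hex');
}

// Returns the cached HTML for a post, re-rendering only if the body has changed.
// `post` is a hypothetical object: {body, bodyMd5, bodyHtml}
// `render` is whatever converts bbcode to HTML (parse_bbc in the PHP script)
function getCachedHtml(post, render) {
  const hash = md5(post.body);
  if (post.bodyMd5 !== hash) { // Cache is missing or stale
    post.bodyHtml = render(post.body);
    post.bodyMd5 = hash; // In the PHP script this is written back with an UPDATE query
  }
  return post.bodyHtml;
}

// Example with a toy renderer that only understands [b]...[/b]
const post = { body: '[b]Hello[/b]', bodyMd5: null, bodyHtml: null };
const toyRender = (body) => body.replace(/\[b\](.*?)\[\/b\]/g, '<b>$1</b>');
console.log(getCachedHtml(post, toyRender));
```

Storing the hash next to the cached HTML means an edited post invalidates itself; nothing has to hook into the forum’s edit code.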
It is common knowledge that you can use the FormData class to send a file via AJAX as follows:
var DataToSend=new FormData();
DataToSend.append(PostVariableName, VariableData); //Send a normal variable
DataToSend.append(PostFileVariableName, FileElement.files[0], PostFileName); //Send a file
var xhr=new XMLHttpRequest();
xhr.open("POST", YOUR_URL, true);
xhr.send(DataToSend);
Something that is much less known, which doesn't have any really good full-process examples online (that I could find), is sending a URL's file as the posted file.
This is doable by downloading the file as a Blob, and then directly passing that blob to the FormData. The 3rd parameter to the FormData.append should be the file name.
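To make the Blob step concrete before the full example, here is a minimal sketch of wrapping bytes you already have in a Blob and attaching them as a file field. The helper name `buildUploadForm` is made up for this illustration, and it assumes an environment with global `Blob` and `FormData` (browsers, or Node 18 and later).

```javascript
// Hypothetical helper: wraps already-downloaded bytes in a Blob and adds it as a file field.
function buildUploadForm(bytes, contentType, fieldName, fileName) {
  const fileBlob = new Blob([bytes], { type: contentType });
  const form = new FormData();
  form.append(fieldName, fileBlob, fileName); // 3rd parameter is the file name the server sees
  return form;
}

// Example: 4 bytes posing as a file
const exampleForm = buildUploadForm(new Uint8Array([0x89, 0x50, 0x4e, 0x47]), 'image/png', 'MyFile', 'Example.png');
const attached = exampleForm.get('MyFile');
console.log(attached.name, attached.type, attached.size);
```

Without the third parameter the server would receive a generic file name, so it is worth always passing it when the Blob did not come from a file input.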
The following code demonstrates downloading the file. I did not worry about adding error checking.
function DownloadFile(
	FileURL, //http://...
	Callback, //The function to call back when the file download is complete. It receives the file Blob.
	ContentType) //The output Content-Type for the file. Example=image/jpeg
{
	var Req=new XMLHttpRequest();
	Req.responseType='arraybuffer';
	Req.onload=function() {
		Callback(new Blob([this.response], {type:ContentType}));
	};
	Req.open("GET", FileURL, true);
	Req.send();
}
And the following code demonstrates submitting that file:
//User Variables
var DownloadURL="https://www.castledragmire.com/layout/PopupBG.png";
var PostURL="https://www.castledragmire.com/ProjectContent/WebScripts/Default_PHP_Variables.php";
var PostFileVariableName="MyFile";
var OutputFileName="Example.jpg";
//End of User Variables

DownloadFile(DownloadURL, function(DownloadedFileBlob) {
	//Get the data to send
	var Data=new FormData();
	Data.append(PostFileVariableName, DownloadedFileBlob, OutputFileName);

	//Function to run on completion
	var CompleteFunction=function(ReturnData) {
		//Add your code in this function to handle the ajax result
		var ReturnText=(ReturnData.responseText ? ReturnData : this).responseText;
		console.log(ReturnText);
	}

	//Normal AJAX example
	var Req=new XMLHttpRequest();
	Req.onload=CompleteFunction; //You can also use "onreadystatechange", which is required for some older browsers
	Req.open("POST", PostURL, true);
	Req.send(Data);

	//jQuery example
	$.ajax({type:'POST', url:PostURL, data:Data, contentType:false, processData:false, cache:false, complete:CompleteFunction});
});
Unfortunately, due to the browser’s same-origin policy (a safeguard against cross-site attacks), you can generally only use ajax to query URLs on the same domain, unless the other server explicitly allows it through CORS headers. I use my Cross site scripting solutions and HTTP Forwarders for this. Stackoverflow also has a good thread about it.
Although, converting to markdown is a time consuming pain
So I started getting on the Github bandwagon FINALLY. I figured that while I was going to the trouble of remaking readme files for the projects into github markdown files, I might as well duplicate the compiled HTML for my website.
The below code is a simple PHP script to pull in the converted HTML from Github’s API and then do some more modifications to facilitate directly inserting it into a website.
Usage:
The variables that can be updated are all at the top of the file.
The script will always output the finished result to the user’s browser, but can also optionally save it to an external file by setting the $SaveFileName variable.
Stylesheet:
The script automatically includes a specified stylesheet from the $StylesheetLocation variable.
The required modifications that need to be made to the css are to change “body” to “.GHMarkdown”, and then add “.GHMarkdown” before all other rules.
This is the one I am currently using for my website, but it also has a few modifications made specifically for my layouts.
Modifications
In my markdowns, I like to link to internal sections by first creating a bookmark as “<div name="BOOKMARK_NAME">...</div>” and then linking via “[LinkName](#BOOKMARK_NAME)”. While this works on github, the bookmark’s names are actually changed to something like “user-content-BOOKMARK-NAME”, which is not useable outside of github. The first $RegexModifications item therefore updates the bookmarks back to their original name, and turns them into <span>s (which github does not support).
The second rule just removes the “aria-hidden” attributes, which my W3C checking scripts throw a warning on.
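To show what the two rules do to a piece of markup, here is a small JavaScript port of them applied to a sample string. The sample input is made up for the illustration (the exact HTML that github returns may differ), and `applyModifications` is a hypothetical name; the PHP script does the same thing in a single preg_replace call.

```javascript
// JavaScript ports of the two rules. The "g" flag mirrors preg_replace replacing every match,
// and "s" lets "." span line breaks like the PHP /s modifier.
const regexModifications = [
  [/<div name="user-content-(.*?)"(.*?)<\/div>/gs, '<span id="$1"$2</span>'], // Restore bookmark name, div becomes span
  [/ ?aria-hidden="true"/g, ''], // Remove aria-hidden attribute
];

// Applies each rule in order to the given HTML
function applyModifications(html) {
  return regexModifications.reduce(
    (result, [pattern, replacement]) => result.replace(pattern, replacement),
    html
  );
}

// Made-up sample input
const sample = '<div name="user-content-usage" aria-hidden="true">Usage</div>';
console.log(applyModifications(sample)); // <span id="usage">Usage</span>
```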
Note that sometimes, the script may return an error of “transfer closed with XXX bytes remaining to read”. This means that github denied the request (probably due to too many requests in too short a timespan), but the input is too large so github prematurely terminated the connection. If this happens, try sending a tiny input and see if you get back a proper error.
<?php
//Variables
$SaveFileName='Output.html'; //Optionally save output to a file. Comment out to not save
$InputFile='Input.md';
$StylesheetLocation='github-markdown.css';
$RegexModifications=Array(
	'/<div name="user-content-(.*?)"(.*?)<\/div>/s'=>'<span id="$1"$2</span>', //Change <div name="user-contentXXX ---TO--- <span name="XXX
	'/ ?aria-hidden="true"/'=>'' //Remove aria-hidden attribute
);

//Set the curl options
$CurlHandle=curl_init(); //Init curl
curl_setopt_array($CurlHandle, Array(
	CURLOPT_URL=>'https://api.github.com/markdown/raw', //Markdown/raw takes and returns plain text input and output
	CURLOPT_FAILONERROR=>false,
	CURLOPT_FOLLOWLOCATION=>1,
	CURLOPT_RETURNTRANSFER=>1, //Return result as a string
	CURLOPT_TIMEOUT=>300,
	CURLOPT_POST=>1,
	CURLOPT_POSTFIELDS=>file_get_contents($InputFile), //Pull in the requested file
	CURLOPT_HTTPHEADER=>Array('Content-type: text/plain'), //Github expects the given data to be plaintext
	CURLOPT_SSL_VERIFYPEER=>0, //In case there are problems with the PHP ssl chain (often the case in Windows), ignore the error
	CURLOPT_USERAGENT=>'Curl/PHP' //Github now requires a useragent to process the request
));

//Pull in the html converted markdown from Github
$Return=curl_exec($CurlHandle);
if(curl_errno($CurlHandle)) //Check for error
	$Return=curl_error($CurlHandle);
curl_close($CurlHandle);

//Make regex modifications
$Return=preg_replace(array_keys($RegexModifications), array_values($RegexModifications), $Return);

//Generate the final HTML. It will also be output here if not saving to a file
header('Content-Type: text/html; charset=utf-8');
if(isset($SaveFileName)) //If saving to a file, buffer output
	ob_start();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
	<title>Markdown pull</title>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
	<link href="<?=$StylesheetLocation?>" rel=stylesheet type="text/css">
</head>
<body><div class=GHMarkdown>
<?=$Return?>
</div></body>
</html>
<?php
//Save to a file if requested
if(isset($SaveFileName))
	file_put_contents($SaveFileName, ob_get_flush()); //Actual output happens here too when saving to a file
?>
After writing the documentation in plaintext format for DSQL just now, I needed to convert it into HTML for the project’s page. I’ve done this before manually and it’s always very daunting, so I decided to really quickly write a script to do most of the work for me, which can be downloaded here, or the code seen below.
It has the following:
Input text box with HTML data that is instantly shown as HTML in a below section when modified.
Both sections take up half the vertical screen space
Undo/redo buffer for the text box (very primitive functionality)
“Open in new page” button, which opens a new window with just the HTML data (useful for validation [W3C or whatnot]).
This is disabled by default because it is a dangerous option (XSS exploitable, so the script would need to be secured/password protected if this was on)
“Escape HTML” escapes HTML characters so they are not improperly interpreted (e.g. “<” becomes “&lt;”)
Listize:
Turns tabbed lists into HTML
For example:
1
2
3
4
5
would become:
1
2
3
4
5
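Since the idea is easier to see with real markup, here is a simplified, DOM-free sketch of tab-based nesting. It is an assumption-laden illustration and not the script’s exact algorithm: the function name `listize` is reused only for convenience, it places tags at the start of each line, and it does not handle the @@@ continuation lines that the full script supports.

```javascript
// Simplified sketch: each leading tab nests the line one list deeper
function listize(text) {
  const out = [];
  let level = 0; // Nesting depth of the previous line
  text.split(/\r?\n/).forEach((line, index) => {
    const depth = /^\t*/.exec(line)[0].length; // Number of leading tabs
    let tags;
    if (depth > level) {
      tags = '<ul><li>'.repeat(depth - level); // Indent: open one list per extra tab
    } else {
      tags = '</li></ul>'.repeat(level - depth); // De-dent: close the deeper lists
      if (depth > 0) tags += '</li><li>'; // Start the next item in the current list
      else if (level === 0 && index > 0) tags += '<br>'; // Plain break between top level lines
    }
    level = depth;
    out.push(tags + line.slice(depth));
  });
  if (level > 0) out.push('</li></ul>'.repeat(level)); // Close anything still open
  return out.join('\n');
}

// Example: two items nested under a top level line
console.log(listize('Fruit\n\tApple\n\tPear\nDone'));
```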
I realized while making the script that I should probably instead just start making my documentation in a markup (like GitHub’s) and then have that converted to HTML and text files. Oh well.
Code:
<? header('Content-Type: text/html; charset=utf-8'); ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Format Text</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<?
$AllowRenderText=true; //Set to true only if this is in a secure environment, as directly outputting a given value can lead to XSS
if(isset($_REQUEST['RenderText']))
return print '</head><body>'.($AllowRenderText ? $_REQUEST['RenderText'] : 'Rendering of text not allowed').'</body></html>';
?>
<style type="text/css">
html, body { width:100%; height:100%; margin:0; padding:0; }
.HalfScreen { display:block; width:calc(100% - 2px); height:calc(50% - 2px - 30px/2); margin:0; border:1px solid black; }
#RenderForm { overflow:hidden; }
#RenderText { margin:0; border:0; width:100%; height:100%; }
#RenderHTML { overflow-x:hidden; overflow-y:scroll; }
.TopBar { height:30px; background-color:grey; }
.Hide { position:absolute; visibility:hidden; top:-10000px; }
</style>
<script type="text/javascript" src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
<script type="text/javascript">$(document).ready(function() {
//History for undoing
var UndoBuf=[], RedoBuf=[];
function Undo()
{
if(!UndoBuf.length)
return;
RedoBuf.push(UndoBuf.pop());
$('#RenderText').val(UndoBuf[UndoBuf.length-1]);
$('#RenderHTML').html(UndoBuf[UndoBuf.length-1]);
}
function Redo()
{
if(!RedoBuf.length)
return;
$('#RenderText').val(RedoBuf[RedoBuf.length-1]);
$('#RenderHTML').html(RedoBuf[RedoBuf.length-1]);
UndoBuf.push(RedoBuf.pop());
}
$('#Undo').click(function(e) { e.preventDefault(); Undo(); });
$('#Redo').click(function(e) { e.preventDefault(); Redo(); });
//Render HTML
function Render() {
//Do the render
var MyVal=$('#RenderText').val();
$('#RenderHTML').html(MyVal);
//Save current value to the history
//*Better history functionality here would be real nice (using smart currentTarget.selectionStart/End calculations), along with an undo/redo button, but not within the scope of this project
if(RedoBuf.length) //Empty redo buffer
RedoBuf=[];
UndoBuf.push(MyVal);
if(UndoBuf.length>100) //Limit history buffer
UndoBuf.shift();
}
$('#RenderText').on('keypress paste', function(e) { setTimeout(Render, 1); }); //Automatic update on paste requires a timeout
//Open in new page
$('#OpenInNewPage').click(function(e) {
e.preventDefault();
$('#RenderForm').submit();
});
//Escape HTML
$('#EscapeHTML').click(function(e) {
e.preventDefault();
$('#RenderText').val(function(index, value) {
$.each({"&amp;":/&/g, "&lt;":/</g, "&gt;":/>/g, "&quot;":/"/g, "&#39;":/'/g}, function(HTMLStr, ReplStr) {
value=value.replace(ReplStr, HTMLStr); });
return value;
});
Render();
});
//Listize based on tabbing
//If a successive line is tabbed over beyond the current, it is made inside a new nested list.
//Tabbing over more than once on a successive line will create multiple nests
//Having @@@ at the beginning of a line will include it in the previous line item, no matter the tabbing
//Make sure to have @@@ blank lines tabbed over to the proper nested level
$('#Listize').click(function(e) {
//Get the text to replace
e.preventDefault();
var T=$('#RenderText').val();
//Go over each line and if the next line is tabbed beyond it, make it a new nested list. Blank
var CurTabLevel=0, NewLines=[]; //NewLines is 2 items per line: the original string and the new html tags
$.each(T.split(/\r?\n/), function(Index, Str) {
//Check for a continued line item
if(Str.substr(0, 3)=='@@@')
return NewLines.push('<br>', Str.substr(3));
//In/de-dent as needed
var Tags='';
var NewTabLevel=/^\t*/.exec(Str)[0].length, PreLevel=CurTabLevel; //Get the nested level
for(;NewTabLevel>CurTabLevel;CurTabLevel++)
Tags+='<ul><li>';
for(;NewTabLevel<CurTabLevel;CurTabLevel--)
Tags+='</li></ul>';
//Fill out the rest of the line
if(NewTabLevel==0) //Breaks between top level new lines
Tags+=(Index && PreLevel==0 ? '<br>' : '');
else if(PreLevel>=NewTabLevel) //If previous item needs to be ended (new level is not greater and not 0)
Tags+='</li><li>';
NewLines.push(Tags, Str);
});
//Finish de-dent as needed
var Final=[NewLines.shift()];
var EndLine='';
while(CurTabLevel--)
EndLine+='</li></ul>';
NewLines.push(EndLine);
//Combine each line with the tags
for(var i=0;i<NewLines.length;i+=2)
Final.push(NewLines[i+0]+NewLines[i+1]);
//Update from the replaced text
$('#RenderText').val(Final.join("\n"));
Render();
});
});</script>
</head>
<body>
<div class=TopBar>
<input type=button id=EscapeHTML value="Escape HTML">
<input type=button id=Listize value="Listize">
<? if($AllowRenderText) { ?> <input type=button id=OpenInNewPage value="Open In New Page"> <? } ?>
<input type=button id=Undo value="Undo">
<input type=button id=Redo value="Redo">
</div>
<form action="FormatText.php" method=post id=RenderForm target="_blank" class=HalfScreen>
<textarea id=RenderText name=RenderText></textarea>
<input type=submit class=Hide>
</form>
<div id=RenderHTML class=HalfScreen></div>
</body>
</html>
Here are a few functions I’ve been finding a lot of use for lately. They are basically the JavaScript equivalent for PHP’s htmlentities and html_entity_decode. These functions are useful for inserting HTML dynamically, and getting values of contentEditable fields. These functions do replace line breaks appropriately, and HTML2Text removes a trailing line break.
var TextTransformer=$('<div></div>');
function Text2HTML(T) { return TextTransformer.text(T).html().replace(/\r?\n/g, '<br>'); }
function HTML2Text(T) { return TextTransformer.html(ReplaceBreaks(T, "\x01br\x01")).text().replace(/\x01br\x01/g, "\n").replace(/\n$/, ''); }
function ReplaceBreaks(TheHTML, ReplaceText) { return TheHTML.replace(/<\s*br\s*\/?\s*>/g, ReplaceText || ' - '); }
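For places where jQuery and a DOM are not available (Node, for example), roughly the same round trip can be sketched with plain string replacements. This is an approximation under stated assumptions rather than a drop-in replacement: the names `text2Html` and `html2Text` are made up here, only the `&`, `<`, and `>` entities are handled, and unlike the jQuery version above, `html2Text` does not strip other tags.

```javascript
// Escape the characters that are special in HTML text, then turn line breaks into <br>
function text2Html(text) {
  return text
    .replace(/&/g, '&amp;') // Must be first, so later entities are not double escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/\r?\n/g, '<br>');
}

// Turn <br> back into line breaks, drop one trailing break, then decode the entities
function html2Text(html) {
  return html
    .replace(/<\s*br\s*\/?\s*>/gi, '\n')
    .replace(/\n$/, '')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&'); // Must be last, so "&amp;lt;" decodes to "&lt;" and not "<"
}

// Round trip example
const escaped = text2Html('a < b\n& c');
console.log(escaped);
console.log(html2Text(escaped));
```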