Home Page
Archive > Posts > 2009 > December
Search:

chunk
Useful little Unix like utility for command line

I needed a command line utility for Bash (for both Windows and Linux) that only outputs bytes between 2 points in a file to STDOUT. head and tail weren’t really cutting it so I figured I’d throw something together. Below is the source code for the result, which I call chunk (Windows Executable).

I compiled the file as c++, but it should be c99 compatible. The file has been tested as compilable for: GCC4 on Slackware, GCC3 on Red Hat, and GCC3 on MingW (Windows [WIN32 should be defined by the compiler]).


Chunk outputs bytes between 2 points in a file to STDOUT. The parameters are:
1) The file
2) The byte offset to start at (hex is supported like 0xA)
3) The number of bytes to output. If not given, the end of the file is assumed.

The source is as follows:
//Copyright 2009 by Dakusan (http://www.castledragmire.com/Copyright). Licensed under Dakusan License v2.0 (http://www.castledragmire.com/Misc/Software_Licenses/Dakusan_License_v2.0.php).
//See http://www.castledragmire.com/Posts/chunk for more information

#define __LARGE64_FILES
#include <stdio.h>
#include <stdlib.h> //strtoull

#ifdef WIN32 //STDOUT only needs to be set to binary mode in windows
	#include <io.h> //_setmode
	#include <fcntl.h> //_O_BINARY
#endif

typedef unsigned long long UINT64;
const UINT64 MaxSizeToRead=1024*1024*10; //The maximum number of bytes to read at a time to our buffer (Must be < 2^31)

UINT64 GetNumberFromString(const char* S) //Extract both hexidecimal and decimal numbers from a string
{
	bool IsHex=S[0]=='0' && (S[1]|32=='x'); //If string starts as 0x, then is a hex number
	return strtoull(S+(IsHex ? 2 : 0), NULL, IsHex ? 16 : 10); //Hex number starts after 2 characters and uses base 16
}

int main(int argc, char *argv[], char *envp[])
{
	//Determine if proper number of parameters are passed, and if not, output help info
	if(argc!=3 && argc!=4)
		return fprintf(stderr, "Chunk outputs bytes between 2 points in a file to STDOUT. The parameters are:\n1) The file\n2) The byte offset to start at (hex is supported like 0xA)\n3) The number of bytes to output. If not given, the end of the file is assumed.\n") & 0;

	//Open the file and get its length
	FILE *TheFile=fopen64(argv[1], "rb");
	if(TheFile==NULL)
		return fprintf(stderr, "File not found or cannot open file\n") & 0;
	fseeko64(TheFile, 0, SEEK_END); //Get the length by seeking to the end
	UINT64 FileSize=ftello64(TheFile);

	//Determine the requested start offset
	UINT64 Offset=GetNumberFromString(argv[2]), SizeToOutput;
	if(Offset>=FileSize)
	{
		fprintf(stderr, "Offset is larger than file's size\n");
		fclose(TheFile);
		return 0;
	}

	//Determine the size to read
	if(argc==3) //If no final parameter, read to the end of the file
		SizeToOutput=FileSize-Offset;
	else //Determine from the 3rd parameter
	{
		SizeToOutput=GetNumberFromString(argv[3]);
		if(Offset+SizeToOutput>FileSize)
		{
			fprintf(stderr, "Requested size is larger than the file, truncating to end of file\n");
			SizeToOutput=FileSize-Offset;
		}
		else if(!SizeToOutput) //If nothing to output, exit prematurely
		{
			fclose(TheFile);
			return 1;
		}
	}

	//Output requested data 10MB at a time from the file to STDOUT
	char *Buffer=new char[SizeToOutput>MaxSizeToRead ? MaxSizeToRead : SizeToOutput]; //Only allocate as many bytes to our read buffer as is necessary
	fseeko64(TheFile, Offset, SEEK_SET); //Seek to the beginning read offset of our file
	#ifdef WIN32 //STDOUT only needs to be set to binary mode in windows
		_setmode(_fileno(stdout), _O_BINARY);
	#endif
	while(SizeToOutput) //Keep reading and outputting until requested data is complete
	{
		UINT64 SizeToRead=SizeToOutput>MaxSizeToRead ? MaxSizeToRead : SizeToOutput; //Number of bytes to read and write
		fread(Buffer, SizeToRead, 1, TheFile); //Read the data
		fwrite(Buffer, SizeToRead, 1, stdout); //Write the data to STDOUT
		SizeToOutput-=SizeToRead; //Decrease number of bytes we still need to read
	}

	//Cleanup
	delete[] Buffer;
	fclose(TheFile);
	return 1;
}
cPanel Hard Linked VirtFS Causing Quota Problems
Quota problems are annoying and expensive to debug :-\

For a really long time I’ve seen many cPanel servers have improper space reporting and quota problems due to the VirtFS system, which is used for the “jailshell” login for users (if a user logs into SSH [or presumably Telnet] whom has their shell set to “/usr/local/cpanel/bin/jailshell” in “/etc/passwd”). Whenever a user logs into their jailshell, it creates a virtual directory structure to chroot them into, and “hard links” their home directory inside this virtfs directory, which doubles their apparent hard disk usage from their home directory (does not include their sql files, for example). Sometimes, the virtfs directory is not unmounted (presumably if the user does not log out correctly, which I have not confirmed), so the doubled hard disk usage is permanent, causing them to go over their quota if a full quota update is ran.

I had given this up as a lost cause a while back, because all the information I could find on the Internet said to just leave it alone and not worry about it, and that it couldn’t be fixed. I picked the problem back up today when one of our servers decided to do a quota check update and a bunch of accounts went over. It seems a script was added recently at “/scripts/clear_orphaned_virtfs_mounts” that fixes the problem :-) (not sure when it was added, but the creation, modification, and access times for the files on all 3 of my cPanel servers shows as today...). Surprisingly, I could not find this file documented or mentioned anywhere on the Internet, and still no mentions anywhere on how to fix this problem.


So the simplest way to fix the problem is to run
/scripts/clear_orphaned_virtfs_mounts --clearall
I did some digging and found that that script calls the function “clean_user_virtfs” in “/scripts/cPScript/Filesys/Virtfs”, so you can just clear one user directory “properly” by running
cd /scripts
perl -e 'use cPScript::Filesys::Virtfs();cPScript::Filesys::Virtfs::clean_user_virtfs("ACCOUNTNAME");'
All this script really does is unmount the home directory inside the jailed failed system (so the hard link is done through a “mount” that doesn’t show up in “df”... interesting). So the problem can also be fixed by
umount -f /home/virtfs/USERNAME/home/USERNAME

After this is done, a simple
/scripts/fixquotas
can be used to set the users’ quotas back to what they should be. Do note that this operation goes through every file on the file system to count users’ disk usages.
Facebook
Finally gave in and signed up

So I recently finally went ahead and got onto the Facebook bandwagon because people kept bugging me about it ;-) (among other reasons), of which I had been trying to stay away from since 2003 when Facebook didn’t allow me to register after I lost my college email due to dropping out.

While it is a nice open system with lots of things to do and some fine grained control, it has a lot to be left desired. While the privacy controls is kind of OK, it could be much much more fine tuned (and the most recent privacy update barely helped that out, if not making things worse).

The thing that has frustrated me the most though is trying to make my own Facebook applications. The documentation and its organization is TERRIBLE (often outdated), and there are so many functions the API just won’t let you do (a lot of stuff with photos, for example). The FQL (Facebook Query Language, like SQL) language doesn’t even let you do updates/inserts of information, just gathering of information (SELECT statements). One part that really pushed me over the edge on the decision to not work with it however is that it looks like it’s constantly being updated and “refined”, with aspects being added or depreciated so often, that it’s just not worth dealing with (though I have to admit I haven’t worked with it long at all, so don’t really have a good sampling ^_^; ). It’s no fun making an application and then having to reprogram it months later just because Facebook can’t get their act together. While reprogramming wouldn’t be a big deal, most likely, I have a thing about going back to old code and updating it unless I did something wrong or want to add a new feature.

While I would still recommend the site to people on general principal (not that everyone isn’t already on it), as it is nicely laid out with most things people need to stay in touch, it’s still not nearly as refined as it should be for the sheer user-base size and scope of the website.

Cross Domain AJAX Requests
Bypassing the pesky browser security model

Since I just released my AJAX Library, I thought I’d post a useful script that uses it. The function CrossDomainGetURL below uses the AJAX Library to make requests across domains in Firefox. It takes one more parameter (not in order) than the AJAX Library's GetURL function, which is an array of domains to pull cookies from for the AJAX request.


function GetCookiesFromURL(Domains) //Return all the cookies for Domains specified in the Domains array
{
	var cookieManager = Components.classes["@mozilla.org/cookiemanager;1"].getService(Components.interfaces.nsICookieManager); //Requires privileges, which is granted in CrossDomainGetURL
	var iter=cookieManager.enumerator, CookieList=[], cookie; //The object used to find all cookies, the final list of cookies, and a temporary object
	while(iter.hasMoreElements()) //Loop through all cookies
		if(((cookie=iter.getNext()) instanceof Components.interfaces.nsICookie) && Domains.indexOf(cookie.host)!=-1) //If a cookie whose host matches one of our domains
			CookieList.push(cookie.name+'='+cookie.value); //Add it to our final list
	return CookieList.join("; "); //Return the cookie list for the specified domains
}

function CrossDomainGetURL(URL, Data, CookieDomains, ExtraOptions) //See AJAX Library GetURL function. CookieDomains is an array specifying what domains cookies are pulled from for the AJAX call. 
{
	//Access universal privileges in Firefox (Required to get cookies for other domains, and to use AJAX with other domains). This functionality is lost as soon as this function loses scope.
	try { netscape.security.PrivilegeManager.enablePrivilege("UniversalXPConnect"); }
	catch(e) { return alert('Cannot access browser privileges'); }

	if(CookieDomains instanceof Array) //If an array of domains is passed to get cookies from...
	{	
		ExtraOptions=((ExtraOptions instanceof Object) ? ExtraOptions : {}); //Make sure extra options is an object
		ExtraOptions.AdditionalHeaders=((ExtraOptions.AdditionalHeaders instanceof Object) ? ExtraOptions.AdditionalHeaders : {}); //Make sure extra options has an additional headers object
		ExtraOptions.AdditionalHeaders.Cookie=GetCookiesFromURL(CookieDomains); //Get cookies for the domains
	}
	
	return GetURL(URL, Data, ExtraOptions); //Do the AJAX Call
}