swixel dot net

Playing by the rules(?)

Python + UPnP for M-SEARCH and NOTIFY subscriptions

Initially, I was given the impression that UPnP was “hard” to use, that it had little exposure in Python, and that I should use C/++ to handle this.  When I started looking into the implementations across various languages I noticed a trend: they were largely binding a UDP socket to a multicast group and listening for relatively simple, HTTP-like text transmissions.  While I’ve now seen some less than standard implementations (i.e. not 239.255.255.250 or not using port 1900), I’ve had little trouble playing with them.

This all started with a Yamaha RXV3900 being remotely controlled via the TCP/IP interface, and UPnP came into play with a Yamaha BD-A1010 being added to the network.  Here’s the code for anyone interested (notably this will run on Python 2.x as well as Python 3.2, but it was written for Python 3.2):

#!/usr/bin/env python3
import socket
import struct

# UPnP group + port
class UPnPListener(object):
	def __init__(self, UPNP_GROUP="239.255.255.250", UPNP_PORT=1900):
		# Build a multicast socket
		# See http://wiki.python.org/moin/UdpCommunication#Multicasting.3F
		sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
		sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
		sock.bind(('', UPNP_PORT))
		mreq = struct.pack("=4sl", socket.inet_aton(UPNP_GROUP), socket.INADDR_ANY)
		sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

		# Store it
		self.sock = sock

		# Create device storage
		self.devices = {}

	# Internally used method to read a notification
	def listen_notification(self, data):
		# partition() copes with a body containing a blank line, or no body at all
		upnp_header, _, upnp_body = data.partition("\r\n\r\n")

		# Get header by line -- as fragments
		upnp_hfrags = upnp_header.split("\r\n") # Get lines

		# Ditch "NOTIFY * HTTP/1.1" ;)"
		upnp_hfrags.pop(0)

		# Storage
		header = {}

		# Fill storage
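		for frag in upnp_hfrags:
			# Standards are helpful! (Header names are case-insensitive,
			# and the space after ":" is optional, so normalise both.)
			name, sep, value = frag.partition(":")
			if sep:
				header[name.strip().upper()] = value.strip()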

		# I don't need it, so I'll clear it here
		# -- I run this on a RaspberryPi, this is overkill elsewhere
		del upnp_header

		# Get the UUID
		if("USN" in header):
			uuid_base = header["USN"].find("uuid:") + 5
			uuid = header["USN"][uuid_base:uuid_base+36]
			if(uuid in self.devices):
				print("UUID Match: %s" % uuid)

	# Start listening
	def listen(self):
		self.listening = True

		# Hint: this should be on a thread ;)
		while self.listening:
			# Grab a large wad of data
			# 'replace' stops one malformed packet from killing the loop
			upnp_data = self.sock.recv(10240).decode('utf-8', 'replace')

			# Filter by type (we only care about NOTIFY for now)
			if(upnp_data[0:6] == "NOTIFY"):
				self.listen_notification(upnp_data)

	# Register the uuid to a name -- as an example ... I put a handler here ;)
	def registerDevice(self, name="", uuid=""):
		if(name == "" or uuid == ""):
			print("Error registering device, check your name and uuid")
			return

		# Store uuid to name for quick search
		self.devices[uuid] = name


if __name__ == "__main__":
	# Create a default UPnP socket
	L = UPnPListener()
	L.registerDevice("RXV3900", "<your uuid here>")
	L.registerDevice("BDA1010", "<your other uuid here>")
	L.listen()
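
The listener above only handles NOTIFY.  For the M-SEARCH half, here is a minimal sketch of an active search, assuming the standard SSDP group and port.  The function names (build_msearch, parse_response, discover) are made up for this example, and discover() has not been tuned against any particular device.

```python
import socket

SSDP_GROUP = "239.255.255.250"
SSDP_PORT = 1900

def build_msearch(st="ssdp:all", mx=2):
    """Build an SSDP M-SEARCH request as bytes."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        "HOST: %s:%d" % (SSDP_GROUP, SSDP_PORT),
        'MAN: "ssdp:discover"',
        "MX: %d" % mx,      # devices delay their reply by up to MX seconds
        "ST: %s" % st,      # search target, e.g. ssdp:all or upnp:rootdevice
        "", "",             # the request ends with a blank line
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_response(data):
    """Return (status_line, headers) with header names upper-cased."""
    text = data.decode("utf-8", "replace")
    head = text.partition("\r\n\r\n")[0]
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().upper()] = value.strip()
    return lines[0], headers

def discover(st="ssdp:all", mx=2):
    """Send one M-SEARCH and collect unicast replies until the socket times out."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.settimeout(mx + 1)
    found = []
    try:
        sock.sendto(build_msearch(st, mx), (SSDP_GROUP, SSDP_PORT))
        while True:
            data, addr = sock.recvfrom(10240)
            found.append((addr, parse_response(data)))
    except socket.timeout:
        pass
    finally:
        sock.close()
    return found
```

Calling discover("upnp:rootdevice") should return a list of (address, (status, headers)) pairs; the USN header in each reply carries the uuid to feed into registerDevice.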

Have fun :)

couchdbkit authentication

I couldn’t find out how you’re meant to make couchdbkit authenticate against couchdb (for those of us who need to auth for production reasons). Anyway, here’s my solution using nothing but the deps (restkit is a dep of couchdbkit ;)) …

"""
Copyright (c) 2012 A.W. 'swixel'/'aws' Stanley.
Released under the MIT Licence or public domain (whatever you're allowed).
"""
# CouchDBKit
import couchdbkit

# RestKit
from restkit import BasicAuth

# Trivial "convert settings to couchdb database instance"
# (Could be split into two parts if you wanted, but I use one DB per app)
def ConnectToCouchDB(uri="http://127.0.0.1:5984/", database="testdb", username=None, password=None, design_docs="design_docs"):
	# Authenticate
	if(username != None and password != None):
		CouchDBAuth = couchdbkit.resource.CouchdbResource(filters=[BasicAuth(username, password)])
		CouchDBServer = couchdbkit.Server(uri, resource_instance=CouchDBAuth)
	else:
		CouchDBServer = couchdbkit.Server(uri)

	# Get or create database
	try:
		CouchDB = CouchDBServer.get_db(database)
		CouchDB.info()
	except couchdbkit.exceptions.ResourceNotFound:
		# I let 'Unauthorized' be thrown here, you could catch it, but my 500 error does that ;)
		CouchDB = CouchDBServer.create_db(database)
		CouchDB.info()

	# Install views
	loader = couchdbkit.loaders.FileSystemDocsLoader(design_docs)
	loader.sync(CouchDB, verbose=False)

	return CouchDB
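
For anyone wondering what that filter amounts to on the wire: HTTP Basic authentication is a single Authorization header carrying base64 of "user:password".  A stdlib-only sketch follows; basic_auth_header is a made-up helper for illustration, not part of couchdbkit or restkit.

```python
import base64

def basic_auth_header(username, password):
    """Return the (name, value) pair for an HTTP Basic Authorization header."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return ("Authorization", "Basic " + token)
```

Because this is only an encoding, not encryption, it wants HTTPS (or a trusted network) between the app and CouchDB.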

Optimising embedded Python: xrange vs range, and file

A number of video games at the moment are using Python as their scripting engine.  It bothers me a bit that so many of them are ‘fast’, and at the same time, use uncompiled code for things they shouldn’t.

One large company is using range.  It’s Python 2.6.  xrange is lazy: it produces each value on demand, meaning a loop over it is roughly equivalent to a C for loop (i.e. i++ for the index).  range is not; it builds a full list holding a value for each entry in the range (‘start’ up to ‘stop’) before the loop even begins.  xrange is fast, xrange uses less RAM, I’m not sure I need to say much more.  It’s on the Python website.  Here, have some references: [1] [2].
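
A quick way to see the difference, written for Python 3, where range behaves the way 2.x xrange did and list(range(...)) reproduces what 2.x range built.  The numbers printed will vary by platform, so none are quoted here.

```python
import sys

N = 10 ** 6

lazy = range(N)          # Python 3 range: values produced on demand (2.x xrange)
eager = list(range(N))   # a full list up front, which is what 2.x range returned

lazy_size = sys.getsizeof(lazy)
# getsizeof counts only the list object itself, not the ints it points to,
# so the real cost of the eager version is higher than this figure
eager_size = sys.getsizeof(eager)

print("lazy: %d bytes, eager: %d bytes" % (lazy_size, eager_size))

# Both iterate identically; only the memory profile differs
total = sum(lazy)
```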

The second major gripe I have, which was discussed in a few IRC channels in the last week (Gamesurge and Freenode hosted ones, but not #python), is a pretty shocking piece of work we found while reversing a game’s binary format.  When we got the Python code out of it, it wasn’t bytecode, it was raw text.  More importantly, it stored a filename value at the time, and then used __file__.  The reason the interpreter got upset was that it tried to use a full path for comparison, so if anyone moved the files to a non-standard directory (as many of us do), things went south.  We had already found it by the time we got a patch which removed this joke of a game-breaking error, so I thought I’d share.  To anyone who suddenly found themselves able to play it in the last week or so: now you know what happened. (No, it wasn’t an indie game.)
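
I can’t reproduce the game’s actual check here, so this is only a sketch of the safer pattern: derive locations from wherever the script currently lives, and if a comparison is really needed, compare file names rather than absolute paths.  Both helper names are invented for the example.

```python
import os

def script_dir(file_value):
    """Directory containing the script, wherever the install was moved to."""
    return os.path.dirname(os.path.abspath(file_value))

def same_script(stored_name, file_value):
    """Compare by file name only, so the install directory is irrelevant."""
    return os.path.basename(stored_name) == os.path.basename(file_value)
```

In a real script, file_value would be __file__, and data files would be opened relative to script_dir(__file__) instead of a path recorded on the developer’s machine.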

Finally, don’t break the argc and argv values.  We shouldn’t be using them, but there are better ways to inject data into your interpreter to load new paths.  Either feed them argc and argv from your main function, or leave them blank.  Throwing random junk in is just weird.

The other option is to write real code.  It may be slower to write than Python, or Lua, or Javascript, but you have the satisfaction of knowing it’ll be harder to reverse, will almost certainly run faster, and will prevent people from modding parts of games in which you don’t want us poking around.  Sure, you could actually think it through and use Lua or Python to inject tables of data (I do this in most of my little tools), marshal the data into lists (or something niftier like a std::vector or a std::map) and write sane code.

Abusing CouchDB for research purposes

Twitter research is something I thought I had overcome; something of the past, a relic of times and thoughts less well processed than those I like to think I have now.  Apparently not.

What is unfortunate about social media (at least from my research perspective) is the amount of easy to access, contextually ‘fixed’, data, lowering the number of variables required for analysis.  What is fortunate, is, well, what I just said.

To manage what I’m told is a ridiculous corpus, I built a CouchDB.  Last time, I took Tweets, reordered their JSON data so _id replaced the id field, then dumped everything except the user data into the database.  This was a HUGE amount of data.  And I mean huge.  It took me three months to load 12.5% of all tweets from May 2010 into a database, and it used hundreds of GB of disk for views.  It was nasty.

This time, I stripped everything except the IDs (user + status), the targets (user + status), text, timezone (text form), and UTC offset (which is more important, but the timezone is pretty).  I didn’t keep the retweet field, as it’s missing in some, and not within the scope of what I care about.  The result is 531,159,273 tweets loaded in 3 days.  The views will take days, if not weeks (if not MONTHS) to build, but I’m limiting what I need, and starting with stale views to get preliminary findings.
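
As a rough sketch of that stripping step: the field names below (in_reply_to_status_id, in_reply_to_user_id, time_zone, utc_offset) are the ones the Twitter REST API used at the time, and reading “targets” as the reply-to fields is my assumption.  slim_tweet is a made-up name, not the actual loader.

```python
def slim_tweet(tweet):
    """Reduce a full tweet dict to the handful of fields kept in CouchDB."""
    user = tweet.get("user") or {}
    return {
        # CouchDB document IDs must be strings
        "_id": str(tweet["id"]),
        "user_id": user.get("id"),
        # assumed meaning of "targets": who and what this tweet replies to
        "in_reply_to_status_id": tweet.get("in_reply_to_status_id"),
        "in_reply_to_user_id": tweet.get("in_reply_to_user_id"),
        "text": tweet.get("text"),
        "time_zone": user.get("time_zone"),
        "utc_offset": user.get("utc_offset"),
    }
```

Missing fields come back as None rather than raising, which matters when some of the source tweets are incomplete.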

Why CouchDB? Indexes, json to json, and the capability to not flip out on me when I do insane things.

The disk array is nothing flash: 4x2TB 5400rpm SATA-II disks running in RAID 10, attached using dual gigabit NICs (PCI-E) in a round-robin configuration.  This is managed by a Core2Duo 4400 (2GHz per core), with 4GB of RAM.  The box holding them does nothing but NFS share to the box running CouchDB.

The box running CouchDB is where I really cheat and move away from “commodity”: a Sun Fire X4600 M2 (so 8x dual core Opterons at 2.8GHz per core, with 32GB of RAM, and RAID1 10k rpm SAS drives).  Not since moving from CouchDB 0.6 have I come close to using more than 400% (4 cores for those less than familiar with Linux), and the disk i/o is holding — Python appears to be my slow point.

The real problem is that the insertions happen so quickly that the views freak out and CouchDB goes down due to RAM usage issues if I try to update while I insert … but other than that, it’s perfect.
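
Reading a view with stale=ok (the CouchDB 1.x query parameter) returns whatever index already exists without triggering an update, which is how a view can be queried while inserts are still running.  A small sketch of building such a URL follows; the database, design document and view names in the usage note are placeholders, and view_url is a made-up helper.

```python
from urllib.parse import quote, urlencode

def view_url(server, database, design_doc, view, stale="ok", **params):
    """Build a CouchDB view URL, asking for the existing (stale) index by default."""
    query = dict(params)
    if stale:
        query["stale"] = stale
    path = "%s/%s/_design/%s/_view/%s" % (
        server.rstrip("/"),
        quote(database, safe=""),
        quote(design_doc, safe=""),
        quote(view, safe=""),
    )
    if not query:
        return path
    # sorted() only keeps the output stable; CouchDB ignores parameter order
    return path + "?" + urlencode(sorted(query.items()))
```

For example, view_url("http://127.0.0.1:5984/", "tweets", "stats", "by_timezone", limit=10) asks for ten rows from the current index; passing stale=None drops the parameter and lets CouchDB update the view first.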

As usual, my loading script is a multiprocessing Python hackjob.  I’d move it to C/++, but by the time I rewrote a socket proc-to-proc linking system and a master+slave daemon configuration, the Python code would be done … and it’s not like I’m doing anything too far outside of what is already C modules for Python.
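
The loading script itself isn’t shown here.  As a sketch of one piece such a loader tends to need, this chunks documents into batches shaped for CouchDB’s _bulk_docs endpoint; that inserts go through _bulk_docs is an assumption, and the helper names are invented.

```python
import json
from itertools import islice

def batches(iterable, size):
    """Yield lists of up to `size` items until the iterable is exhausted."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def bulk_payload(docs):
    """Encode one batch as the JSON body expected by POST /<db>/_bulk_docs."""
    return json.dumps({"docs": docs}).encode("utf-8")
```

Each batch is independent, so batches could be handed to worker processes (for instance via multiprocessing.Pool) with each worker POSTing its own payload.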

My own little world of JavaScript.

My parents bought a Philips Pronto remote control.  A TSU9600, to be precise.  They also bought an extender, an RFX9600.  The protocol it uses is largely unknown to me at this time, but with Wireshark and a little bit of port mirroring that shouldn’t be a problem.

The Philips remote runs on a variant of Javascript (prontoscript).  Yes, the entire thing is written in what is effectively synchronous javascript.

Enter the Samsung Televisions they have.  Interestingly, both of these can have apps written for them in Javascript.  This time it isn’t something akin to, it is JS.

Finally, enter their mobiles — Samsungs, both.  Now, while Android applications can be written in Javascript (ugh), they have decent browsers, and with the power of jQuery and a box on the network, we have another avenue of Javascript glory.

Enter node.js.  Now, I’m ordinarily a Python hacker, or even C++, and, on occasion, an Erlang newbie.  However, as node.js is as close to the Twisted of javascript as I’ve ever seen, I thought I’d give it a whirl.  A week or so later, I’m playing with express.js and finding it not entirely irksome.

My idea, at present, is to set up some sort of gateway which takes Pronto commands as if it’s an extender, running as a relay.  Using asynchronous calls I can push data to the target device (or extender).  This will require some reversing, but should (hopefully) enable a few more homebrew solutions to problems, not to mention ways around lag on the remote itself.  My current aim is to replace my old squeezebox javascript blob with something that runs on a server, passing JSON back and forth and asking the remote to do a non-blocking sleep while waiting for updates on the node.js instance.

I’ve achieved something similar with CherryPy and sockets (in Python), but JS<->JS<->JS should be more fun.  Anyway, the project gets fun after that: hooking up the TV to music controls in the room it’s in…

Introducing: TrendyNode

What I have is a single javascript file, hinged on the use of nodejs.

It stores player information as JSON.

What it can do perfectly:

  • Let the player “Load” data;
  • Rotate the saves on save request.

What it can do well enough to work:

  • Authenticate;
  • The heartbeat;
  • Begin the save request.

What it presently fails to do:

  • Save the data — requires additional reversing.

What it will never do:

  • Circumvent VAC;
  • Replace TrendyNet (this is for LAN use with achievements on).

Authentication is largely a mystery, but I have a workaround in mind if I cannot work out how part of the auth function works — IP restrictions based on Steam’s OpenID (for the internet) or using IP restrictions on LAN (SteamID still applies here, once logged in, it’s locked until an admin resets it, or 24 hours pass).

I’m also mishandling authentication data at the moment — mostly because I developed this on a ‘system-by-system’ setup, restricted to localhost usage.

If anyone has questions/comments, let me know … it’s not like this system is complex, though the binary is seriously annoying due to the use of std::basic_string.

    Dungeon Defenders Servers: Steam and TrendyNet

    Generally, I’d hold this back, but having received an infraction for being blunt, and a bit of abuse through member emails, I figured I’d be a little more blunt.  Some points:

    First:  Trendy doesn’t let you use any tools to modify your game.

    Second:  Trendy apparently actively bans users.

    Third:  Wireshark is free and lets you look at insecure traffic.

    Fourth:  Objdump is awesome; failing that, dumpbin will do.

    Fifth:  ‘cloudapp.net’ is owned by Microsoft for Azure.

    Things you can find out knowing these points:

    • Steam is used for matchmaking (try: dumpbin /imports:steam_api.dll DunDefGame.exe);
    • TrendyNet is hosted on Azure, and uses HTTP traffic (not even HTTPS);
    • TrendyNet hosts all save data, Steam does not;
    • Saves are not feasible using either Trendy or Steam, the way you could script one is quite simple, but to prevent abuse, it’s not sane;
    • Apparently suggesting an eight year old with Wireshark can detect Steam, and a ten year old could detect (and reverse) the HTTP used for TrendyNet is insulting (note the fairly trivial introduction video to wireshark, and that there are more).  Update: I got an 8 year old to prove me right…

    Yes, I have my own TrendyNet backend and decoupled my legitimate install from Steam, without once modifying the game binary …

    Brief update: No, I won’t be giving away my own TE backend code at this time.  It was an experiment I did for research purposes, so I could better help people having issues, not something to be done to enable hacking.  That said, given the average reaction to things, and my assumption of an average level of intelligence/understanding/knowledge, I’m pretty much done with that.

    Update: As a note, this is Phase #1.

    Dear AMD …

    Dear AMD,

    You have APUs which are, quite frankly, neat.  They make consoles look terrible, and you could easily push an AMD64/x86 console out every 2 years at a retail price of $1000 (AUD, not USD).  You could make a platform which developers could actually target, hook into distribution platforms (like Steam) to do your handy work, throw in an OEM copy of Windows 7, something basic like “MS Security Essentials” for the anti-viral load, keep keyboard/mouse flexibility, but restrict the hardware.

    The end result would be games on a platform which isn’t hopelessly outdated in 6 months, but a platform which is accessible and developers could be reasonably expected to support and target.  Throw in some sort of neat little sticker programme: “Certified for the AMD Console MK IV”, and you have yourself a winner.

    What would the internals look like? A laptop — just add cooling, some glowing bits, and the ability to interface with a TV (oh wait, HDMI does this for you).

    Regards,

    A nerd (who is sick of bad game ports and shoddy code).

    Scripting languages and Interpreters

    Recently I’ve been taken with the distinction between scripting languages and their interpreters.  Some are fluid, adaptive, and enable all sorts of magic and trickery that I particularly enjoy abusing.  On the other hand, there are the rigid kind, which usually feature explicit casting.

    I should stop reading compiler design books, but this sort of stuff is just amazing.  It’s a bit beyond what I usually like doing, but I can see applications …

    Modding and people

    Invariably, people don’t appear to understand the effort taken to modify a game, unless of course they are the ones doing it, or have a similar (or better) level of understanding.

    Various games bring about people who irk me, but more often than not those who might appear to be culprits of it aren’t, and those who drape themselves in a cloak of understanding often don’t.  This reversal is probably more infuriating than anything else because the subjects are either long-time franchise fans or computer scientists, who assume that because it’s the ideal model it’ll work (or worse, they don’t understand the commercial forces which drive the choices, even though they’ve studied that too).

    I could draw a line in the sand between myself and these people, but in so doing I would be cutting myself off from the community as a whole, and I would be modding for nobody but myself (which, frankly, would make it much easier).

    Tropico 4 isn’t necessarily one of the worse communities, but there is certainly a level of entitlement which rivals Team Fortress 2.  The problem with entitlement is that players believe that the publisher or a modder should add/fix/remove something because they think it fits within the scope of the game.  Were it necessary, the developer would either add it or possibly DLC it (see maps in the COD franchise).

    I’m not saying that everyone out there with their hand out is an idiot, or that they need to have a deep understanding of the game, but many of the complaints aren’t well thought through.

    Aside from modding, you then have the people who make you wonder about humanity.  Those who have tried everything but removing anti-viral programmes, insisting that an application designed to peer and interfere with other applications couldn’t possibly be causing that fatal cross-thread deadlock that plagues every game on the same engine.  You’ve got the people who didn’t read the box/site/page/forum/billboard/poster which screams that their system is inadequate for the game they’re about to buy.

    You then have the additional “know-it-alls”, and I don’t mean those who are part way through the code trying to unravel the magic (read: people like me), but the people who rave about features X, Y and Z while being dead wrong about one or more of them.  They are supplemented from time to time by those who want to know every secret to “perfect” their experience and get the best score (why would we give that away?).

    To top it off, you have people complaining on various forums about delays and “unhelpful” staff on forums/lists, without registering that maybe, just maybe, there is a reason they’re not giving away the secrets of the game or that they aren’t responsible for the delay but under contract can’t make a comment about the delay.

    Finally, there are the people who are trying to be helpful, and correct things that weren’t wrong in the first place.  Anachronism has no place in video games, save, perhaps, to demonstrate the path from which the franchise or system has grown.  Want features from that game? Go play it.  The incremented number at the end of the title says one thing to you: this isn’t the game you’re talking about, it’s a new one.  Yes, it looks the same, speaks the same language to you, and probably does many things in the same way.  The differences in the game are not that big, I hear them complain, saying that a new game wasn’t worth it.  Apparently the difference between your frontal lobe and that of a low-tier primate isn’t either.

    Entitlement is the sum of everything wrong with the gaming industry.  While you may have a product to move, you can’t bow to people completely or you end up with a fan base expectant of your compliance.  (This post is, of course, to say nothing of the people who want “modkits” for engines, but if you want to read it that way, go to the beginning and start with that in mind; it probably works equally well.)