What do you do when you want to distribute a Python solution through pip but all you have is a Subversion server? You can turn your code into a package and ask pip to kindly use your SVN server as a trusted source. This text describes a way of doing exactly that with minimal configuration, without bothering your busy build engineers.

This piece covers how to do the packaging manually. cookiecutter would be another option but seems overkill for what I want to do. The only dependency of note is a web-browsable Subversion repository, or any web server that exposes a directory index.

Why package internal tooling?

If you're extremely lucky, all your code runs on the Python standard library alone. Congratulations. You can distribute your solution by email if you want. But perhaps you want to keep some form of versioning, or expose sensible entry points, among other things.

I arrived at this problem while developing an internal tool for a team of sound designers working on Wwise. I was virtualenv-ing my way around the development, but after a couple of dependency installs I started thinking about distribution. I considered the classic requirements.txt included in the sources, asking the guys to pip install -r requirements.txt, but somehow that solution feels like it belongs more to a CI/CD environment than to end-user distribution. Not to mention that you're asking your end users to sync your sources, and perhaps you don't want that.

Then there’s the problem of executing the tool itself. There is a difference between:

python cli_amazing_tool -a foo -b bar -c aux

and

cli_amazing_tool.exe -a foo -b bar -c aux

And I had the added problem that my solution was bound to a specific version of an internal library, also written in Python. That library was under heavy development and maintaining matching versions was fundamental for my sanity.

Python’s packaging system can take care of all this with ease. With just one file.

Setup.py: configuring a Python package

First things first: the documentation for setuptools is here. If you skim it for the good stuff you'll see a couple of almost ready-to-use configurations.

The content of an extremely basic setup.py file could look like this:

from setuptools import setup, find_packages

setup(
    name="cli_amazing_tool",
    version="1.2.3",
    packages=find_packages(),

    entry_points={
        "console_scripts": [
            "amazing_tool = cli_amazing_tool.main:main"
        ],
    },

    install_requires=["waapi-client==0.3b1"],
    author="jcbellido",
    author_email="jcbellido@jcbellido.info",
    description="A waapi-client based tool",
    keywords="wwise WAMP waapi-client",
    project_urls={
        "Documentation": "http://confluence.jcbellido.info/display/DOCS/cli+amazing+tool",
        "Source Code": "https://your.svn.server.net/svn/trunk/sources/cli-amazing-tool",
    },
)

As you can imagine, packaging is a big problem; that's why we have build and release teams. But for the lonely developer on a shoestring budget this approach does perfectly well. There are a couple of tricks in the previous configuration:

  • install_requires: This is the key feature for me. pip will take care of the package dependencies through this list.
  • packages=find_packages(): this is the auto mode for setuptools packaging. As far as I understand it, it acts as a crawler and adds every package (i.e. anything with an __init__.py) to the final .tar.gz. In my case this includes the tests, but honestly I prefer it that way. It has been useful a couple of times.
  • entry_points: When defined, pip will create .exe wrappers for your console scripts on Windows. This example is overly simplistic; it should be trivial to create meta-packages that expose a suite of related commands, as sketched below.
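To make that last point concrete, here is a sketch of how a single package could expose several related commands through console_scripts. The tool and module names below are made up for illustration; only the shape of entry_points matters:

from setuptools import setup, find_packages

setup(
    name="cli_amazing_suite",  # hypothetical meta-package
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # one command wrapper per entry, all shipped by the same package
            "amazing_import = cli_amazing_suite.importer:main",
            "amazing_export = cli_amazing_suite.exporter:main",
            "amazing_report = cli_amazing_suite.report:main",
        ],
    },
)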

Package Generation

Once your setup file is ready, from the project root:

python setup.py sdist

This command will take the package definition contained in setup.py and pack everything into a .tar.gz file, in this case something like cli_amazing_tool-1.2.3.tar.gz. That's the file you must push to your repository.

Something I observed is that the command complains about a weird dependency after a change to setup.py. Before worrying, delete the .egg-info directory and re-run setup.py; that fixed it for me pretty much every time.

Installing on user machines

Once your packages are submitted to your repository, and if you're lucky, your IT department will have pre-installed Python on your users' machines. If that's not the case you can always install Chocolatey and ask the guys to install the dependencies themselves; I actually tend to prefer it this way. It opens the door to even more control over the execution environment of your solutions, but that's not the point of this text.

Once the interpreter is installed you just need them to execute something like:

pip install cli_amazing_tool==1.2.3 --trusted-host your.svn.server.net -f http://your.svn.server.net/svn/packages/something/cli-amazing-tool

… a command that can live perfectly in a PowerShell script.

If you pay attention you'll see --trusted-host your.svn.server.net. This helps if you don't want to use HTTPS: perhaps your local SVN server isn't configured for it, or perhaps you don't want the hassle of server certificates. It's an option. Not recommended, but useful.
The -f option just adds a new source for pip to look in.

Profit

Once this first loop is done and your users can painlessly install and update their tools, you'll have reached a form of parity with compiled languages. Having your code contained in a package will help you if you decide to go CI, and it simply makes things clearer in the long run.

For me there's one more step to take, though: full packaging, with every dependency included in a single redistributable file. I've read about a couple of options, like shiv, that seem to do what I need. But that's material for another text.

Bellido out, good hunt out there!

/jcb


During the last months I've been involved in an infrastructure project. The idea is to offer on-demand resources; think Jenkins, GitLab, or any render queue. In my case, users are working from different countries and time zones. This is one of the cases where building a web-based front end makes sense.
The challenge: I've never built anything mid-sized on the web, only micro solutions that needed close to zero maintenance and were extremely short-lived. To make things more interesting, the backend offers its services through gRPC.

A note for other tool programmers

This is a piece about my second project using React. The first one, even if functional, was a total mess. I'm not suggesting the approach described here makes sense for everyone, but it has worked for me and I think keeping it documented has value.

The main issue with web stuff, for me, is the number of things you need to juggle to build a solution. To name just a few, this project contains: JavaScript, React, Babel, JSX, gRPC, Docker, Python, CSS, Redux and nginx. It's surprisingly easy to drown in all that stack.

Starting: react-admin + tooling

I needed an IDE for JavaScript and I didn't want to consume a license from the web team, so I started with Visual Studio Code. Coming from a bloated VS Pro, the difference in speed and responsiveness is remarkable. Adding JavaScript support was also quite simple using a Code plugin. Below it, I had a common npm + Node installation. For heavier environments, JetBrains' WebStorm IDE is what the professionals around me use most frequently.

From that point a simple:

npm install -g create-react-app
create-react-app my-lovely-stuff
cd my-lovely-stuff
npm install react-admin

will get you started. You can see a demo of react-admin from the marmelab team here:

With all that in place, how to start? After checking with more experienced full-time web devs, they recommended using react-admin (RA from now on) as a starting point. Only later did I realize how much RA's architecture would shape the rest of the solution, but as a starting point it is great. The documentation is really good; I learnt a lot from it. From the get-go you'll have a framework that makes the following easy:

  1. List, show-detail, edit and delete flows
  2. Pagination
  3. Filtering results
  4. Actions on multiple selected resources
  5. Related resources and references (i.e. "this object references that other thing"), which make navigation between resources simple

Halfway through the development I found out about React hooks. I strongly suggest watching this video; it was well worth the time I put into it:

I used only a fraction of the potential hooks offer and that was more than enough. The resulting code is leaner and more expressive. If I need to write another web app using React I'll try to squeeze more out of them.

RA is built on a large number of third-party libraries. For me the two most important are:

  1. React-Redux: I use it mainly in forms and to control side effects. Some of the forms I have in place are quite dense and interdependent.
  2. Material-UI: Controls, layout and styles. From what I'm seeing around lately it has become an industry standard. Out of the box it gives you a Google-y look and feel.

Unless you're planning to become a full-time web developer I don't believe it's particularly useful to dig too deep into those two monster libraries. But a shallow knowledge of what each of them is for can be quite useful.

gRPC in the browser: Envoy + Docker

The backend serves its data through a gRPC endpoint and was being built at the same time I was working on the frontend. One of the main concepts of gRPC is the .proto file contract: it defines the API surface and the messages that travel through it. Google et al. have released libraries to consume gRPC services (based on that .proto specification) in many different programming languages, including JavaScript, .NET Core and Python.

But the trick here is that you can't directly connect to a gRPC backend from the browser. In the documentation, Envoy is used to bridge the two. In other scenarios it's possible to use Ambassador, if your infrastructure supports it.

Since the backend was under construction I decided to write a little mock in Python, based on the .proto file. Starting from the .proto file, I return the messages populated with fake but not random data. The messages are built dynamically through reflection from the grpc-python toolset output. The only manual work needed is to write the RPC entry points, which are then automatically forwarded and answered by the mock.
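To give an idea of the shape of such a mock, here is a minimal sketch. It is not the actual tool (which builds the messages through reflection, as described above): the service, message and module names are invented and would come from your own .proto compiled with grpcio-tools.

import concurrent.futures

import grpc

# Hypothetical output of: python -m grpc_tools.protoc ... inventory.proto
import inventory_pb2
import inventory_pb2_grpc


class MockInventoryService(inventory_pb2_grpc.InventoryServiceServicer):
    def ListItems(self, request, context):
        # Fake but deterministic data, so the UI always has something stable to render.
        reply = inventory_pb2.ListItemsReply()
        for i in range(3):
            item = reply.items.add()
            item.id = str(i)
            item.name = "fake-item-%d" % i
        return reply


def serve():
    server = grpc.server(concurrent.futures.ThreadPoolExecutor(max_workers=4))
    inventory_pb2_grpc.add_InventoryServiceServicer_to_server(MockInventoryService(), server)
    server.add_insecure_port("[::]:50051")  # Envoy proxies the browser's gRPC-Web calls here
    server.start()
    server.wait_for_termination()


if __name__ == "__main__":
    serve()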
Once the fake server is written you still need to make it reachable from the web browser. This is where docker-compose made my life way simpler: I wrote a compose file with Envoy and my mock server connected, and I had a reliable source of sample data to develop the UI against. In this case I was lucky, since my office computer runs a Pro version of Win10, which makes Hyper-V available, and the Docker toolset for Windows has improved a lot lately.
It's perfectly possible to achieve similar results with non-Pro versions of Windows, or even more simply by using a Linux or Mac desktop.

This small solution turned out to be quite important down the line, given the amount of iteration the backend went through. In the web world there are many great API/backend mocking solutions based on REST calls, but when you're working with gRPC the ecosystem is not as rich (or at least I didn't find anything mature at that moment).

Other lessons

One of the interesting side effects of using RA is the impact of the dataProvider abstraction. The whole architecture orbits around classic HTTP verbs. In the end, most of my code beyond some specific layout and extra forms was pure glue. I have full translation layers in place: from gRPC into JavaScript objects and vice versa.

In my domain, and due to API restrictions, I was getting different categories of resources through the same gRPC endpoints. After thinking about it a bit, the simplest solution I found was to implement pre-filtered data providers and give them resource-relevant names. In other words, I ended up with a collection of data providers that internally pointed at the same gRPC services but carried meaningful names. This allowed me to offer meaningful routes while keeping the UI code isolated from the backend design.

Containers, Docker in my case, are becoming more and more important as I go forward. If you know nothing about them I strongly suggest you put some time into them. They can be a game changer, even if your only intent is to keep your dev environment as clean as humanly possible.


During the last 16 to 18 months I've been working primarily with Microsoft technologies on the desktop: a big lump of WPF + OpenXML + Entity Framework. In other words: big stacks, massive code bases and tons of hours trying to understand what is going on under every:

using (var context = new DbContext())
{
    var stuff = await context.Thangs.Where(w => w.Foobar < 3).ToListAsync();
    ...

.. block in my code.

I felt a little bit saturated. I wanted a project on the side, something interactive. And that's how I found Godot, an open-source game engine and an all-in-one package.

Getting engine + tooling

This game engine was born around 2007 and it's been in development since then. The project got an MIT license at the beginning of 2014. The mainline today is at version 3.0.5 and, yes, there are versions for Mac and Linux. And just to make things even simpler, you can fetch a precompiled Godot from Steam. It doesn't get simpler than that.

It's also possible to build the engine, tooling included, from source, even though it's not the simplest distribution system I've seen. The "Compiling" documentation includes several step-by-step guides that worked well for me.

If you're working under Windows you'll notice that the size of the .exe is around 20 MB. That's all; that includes both the environment and the runtime. The editor, once opened, looks like this:

If you're interested in testing the game in the image, you can try to play it in a browser.

As usual, if you're planning on releasing on different targets, like iOS or Android, you'll need the SDKs and the size may vary. At the moment there's no official support for consoles.

Learning Godot engine

An interesting way of approaching this technology is to check some projects. Luckily, a game jam was hosted on itch.io quite recently, the Godot temperature game jam, and the submitted projects are interesting to play and inspect. It's possible to download the sources and build the games yourself; most of the titles I checked host their sources on GitHub.

Godot's architecture and code base make it well suited for teaching and for starting out in gamedev. It's possible to develop new behaviors using the internal language, GDScript.

It's also relatively simple to find YouTube playlists covering the basics of the engine; one example, found on Game From Scratch's channel, is the Godot 3 Tutorial Series.

I know there are a number of online courses, in the shape of Patreons, online universities, etc., but I don't know enough about those to have a clear opinion.

Meanwhile, in the world

And now for something completely different: while I was deep inside one of Microsoft's tech stacks, the guys have been busy and we have new, nice and neat toys:

Blender is looking better than ever and it's approaching 2.8 at the whopping speed of one second per second. Perhaps this video could help you catch up:

.. fantastic work.

Cyberpunk 2077 has a new trailer after years of silence. There’s quite a lot to write about CD-Projekt, timing, marketing, and whatnot.

.. but for now, it's enough to say that I might have had some part in the behind-closed-doors demo at 2018's E3.

Battlefield V seems to be, somehow, advancing in time: the team travelled from WWI into WWI + 1, or, in a trailer:

.. which, as usual, looks spectacular.


During the last weeks I've been asked to write some documentation for the localization tech stack I've been working on for the last 18-ish months. In the team I'm in nowadays there's a group of specialized documentation writers. Tech writers.

And when you check the docs they create, it's clear they're professionals. Unified styles, neutral English, linked documents, different sorts of media including images, GIFs, videos, links to code, examples in the game… everything you can imagine. It looks costly, and it is.

And that works well for teams of some size. Let's say sizes over one person. I've been driving absolutely every aspect of the stack by myself: DBs, caching, services, UI, exchange formats. On two very different projects at the same time. Starting from scratch. It's been a blast. But it's a messy blast.

How it should look, for me

When consuming documentation I want 2 sources of information:

  1. The final-user view of the stack. What does the user see? How does the UI work? Which metaphors are deployed?
  2. A high-level architectural view of the code base. Server based? Service based? Local user only?

… and, once the intent is clear and the language shared with the user base is defined, then, if possible, show me some unit cases. Nothing fancy or spectacular, just something to start tweaking here and there.

That would be the gold standard.

Then, obviously it’s better when the code is not rotten. But that’s a daily fight. And a different discussion.

So what’s next?

Umh, after the E3 mayhem, maybe I'll be able to convince some producer to redirect a couple of peers from QA to work with me for a couple of weeks, and we'll go together through all the insane nooks and crannies that one-man operations tend to generate at these scales. If I'm lucky this person will be able to create some end-user documentation and we'll discover some easy points for improvement.

Meanwhile, obviously, I have even more stuff to develop, including a nasty data migration related to a deep change in our domain.

Oh, the good ol' times when I believed that running Doxygen and fleeing was enough.


Fair warning: this article is very technical and I'm not an expert on language analysis; I'm just a programmer with a problem to fix.

Extending strings in Localization

Think about the last time you played an online game. A competitive FPS, for example. The match ends, there’s a winner and the game displays:

Player xXxKillATonxXx wins the match

How can the game developers know in advance that Mr. xXxKillATonxXx was going to play? That's either a string concatenation or a string substitution, and sometimes game devs opt for the second choice. This means that in the source of the game we'll have something like this (let's not get into whether this is a good solution):

ID_WINNING_GAME = "Player {playerName} wins the match"

See it? That's hell for QA. If you have, let's say, 18 different languages to localize into, you need to be sure that those curly braces match, that the variable name "playerName" is correctly spelled in every language, and so on and so forth. That's a reasonably easy problem to solve using regexes, but what happens when the UI team goes really wild and allows something like this:

ID_WINNING_GAME = "<red>Player</red> {playerName} <blue>wins <italic>the</italic> match</blue>"

… well, in that case you don't have a simple string anymore, you have a DSL, which is a much more complex problem to solve. And, from the QA perspective, it's more difficult to track.

So at this point we have a combination of tags and variables that can be nested indefinitely, in a process that is incredibly error prone and very difficult to catch by eyeballing strings. It will also end in broken strings on screen at runtime, and that's a risk for certification.

And don't forget that, due to grammar, different languages might have the tags in different places and maybe in different orders. The only rule is that any localized version should have the same tags and structure (in terms of tag nesting) as the source language.
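For the variables-only case, by the way, a regex check really is enough. A quick sketch, in Python just for illustration (the rest of this post uses C# and Pidgin):

import re

PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def placeholders_match(source, localized):
    # Every localized string must use exactly the same set of placeholders,
    # spelling included, no matter where grammar places them in the sentence.
    return set(PLACEHOLDER.findall(source)) == set(PLACEHOLDER.findall(localized))


# placeholders_match("Player {playerName} wins the match",
#                    "{playerName} gana la partida")   # -> True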

Parsing DSLs: enter Pidgin

Facing that problem I had two alternatives: either program a recursive parser that would chew through the strings and tokenize them properly, or use a more formal approach, in this case through Pidgin. The documentation of this library is pretty good, and the test and samples folders contain a plethora of good small snippets that you can use right away.

So, let's dig into this problem a little bit. For simplicity, I'm going to reduce the scope to single format strings that can be nested as much as we want. Let's begin with the basics and consume innocent strings:

Parser<char, string> Fluff = Token(c => c != '<' && c != '>').ManyString();

Simple enough, right? A call to Parse with that parser will consume anything that doesn't contain < or > and will be flagged as Success. On top of that, Fluff also accepts empty strings.

We can make our lives a little bit simpler by adding a bunch of small parsers:

Parser<char, string> LT = String("<");
Parser<char, string> GT = String(">");
Parser<char, string> Slash = String("/");
Parser<char, Unit> LTSlash = LT.Then(Whitespaces).Then(Slash).Then(Return(Unit.Value));

So we have the basics of the language right there: LTs, GTs, slashes… all the components. Let's aim for something more complex, the tag Identifier, where we impose that the first element has to be a letter, in glorious LINQ style:

Parser<char, string> Identifier = from first in Token(char.IsLetter)   // "Token" makes this parser return the parsed content
                                  from rest in Token(char.IsLetterOrDigit).ManyString()
                                  select first + rest;

… and now we're ready to consume a full string that starts with a format marker and ends with the closing of that format marker. Something like this will do:

Parser<char, Tag> FormatTag = from opening in LT
                              from formatLabel in Identifier.Between( SkipWhitespaces )
                              from closing in GT
                              from body in Fluff // !!! Attention here
                              from closingTokenOpener in LTSlash
                              from closingLabel in Identifier.Between( SkipWhitespaces )
                              from closingTokenCloser in GT
                              where ( formatLabel == closingLabel ) // we make sure we're closing the correct tag
                              select new Tag( formatLabel, body ); // let's imagine you have this type defined

If we're lucky enough and the string that we need to parse is surrounded by a single format marker, this piece of code will take care of it and return a Tag object that we'll be able to compare and consume later.

But that's not quite what we want to solve. We should replace that call to Fluff with something that can potentially consume more tags embedded in the string. We also need to take care of a string that starts and ends with normal text and happens to have a Tag in the middle. Let's do that now:

Parser<char, Tag> tagParser =
    from preFluff in Fluff
    from items in Try( FormatTag )
    from postFluff in Fluff
    select items;

See that Try modifier? That's what enables the parser to backtrack in case of failure. In essence, you don't "lose the input" and you can try other rules. Incredibly useful. But we still can't consume several of these rules in a row; let's fix that now:

Parser<char, IEnumerable<Tag>> stringParser =
    OneOf( Try( tagParser.AtLeastOnce() ),
           Try( Fluff.ThenReturn( null as IEnumerable<Tag> ) ) );

That needs some unpacking:

  • OneOf accepts a sequence of parsers and will try them from left to right; once one consumes input, that one is selected, otherwise it fails. In this case we're trying to parse either a tag or simple, innocent text.
  • AtLeastOnce executes the previous parser one or more times and accumulates the output into an enumerable container.
  • ThenReturn lets you return whatever you want once a parser has completed successfully; in this case we need to change the output of Fluff from string to IEnumerable<Tag>. In the end, the goal is not to know what the string contains but to ensure that the structure remains the same across languages.

So, going back to our FormatTag parser, we need to tweak it a little:

Parser<char, Tag> FormatTag = from opening in LT
                              from formatLabel in Identifier.Between( SkipWhitespaces )
                              from closing in GT
                              from body in stringParser // <<<<<<<<
                              from closingTokenOpener in LTSlash
                              from closingLabel in Identifier.Between( SkipWhitespaces )
                              from closingTokenCloser in GT
                              where ( formatLabel == closingLabel )
                              select new Tag( formatLabel, body );

And there we have it: nested strings, embedded indefinitely, with memory as the only limiting factor in this solution.

This is, of course, an incomplete solution. But it covers the main points of the grammar in place: recursion and tag verification.

Some lessons

  1. Recursive grammars become incredibly complex to parse. Using TDD is a must.
  2. Chop, chop, chop your problem. Every parser should do the absolute minimum; combining cases is the shortest route to failure and headaches.
  3. Test for End(). Sometimes the strings are empty, or you want to check that you've consumed the whole input.
  4. OneOf + Try is a pattern in its own right. The library might have something more compact but, with my current knowledge, I like to use it.

Not data driven, but flexible enough

One of my few regrets with this solution is that it's not completely data-driven. Other two-step solutions would have been more flexible. Imagine a grammar description in an external file that is compiled at runtime into an in-memory parser that you can use as you please. That would have been way cooler, but also more complex, at least with my current knowledge of these libraries and technologies.


After so many words about the current state of my country, what to say next? What's going on?
It's particularly difficult to explain. Let's take Mr. AteoTube, for instance. Italian. In his 30s. Do you think he can "get it"? Nope… it's challenging, and requires a ridiculous amount of dedication.

It's been two very intense weeks. Politicians sending letters to each other (a legal recourse, obviously), veiled comments, strange maneuvers. And, well, a probably planned crisis that will last long. The first estimates point to the six-month mark. Six months of companies moving their headquarters and tax-return locations.

Companies

One of the main points of the separatists was that the companies, the biz, the hard cash would remain in the territory, given the vast superiority of the to-be republic. Well, that has proven to be not exactly true. As in completely false.
At this moment two main banks and hundreds of companies have moved their "fiscal homes" to other territories. Madrid, Valencia, Alicante, Mallorca, Zaragoza and other cities are the new homes of the business.

So, that’s one. No reasonable company wants to abandon the EU in favor of a completely unexplored legal ground. They have interests to protect.

Attrition

We don’t know the intent of our (current) president. What is he up to? What’s the plan? We’ll see when it’s done. And that’s not something I particularly like. We’re not kids. We deserve to know. What is the state going to do?

What is clear is that this situation is a war of attrition. At this moment we don't have trenches or violence, but we're going to endure a long standoff. This makes no sense, it's painful, and it will damage Spain and Europe. This seems to be our current president's motto: "to victory through lack of action".

I'm incredibly tired of this.


The last weeks have been intense in many ways. So let’s get into detail.

Weaponizing cat GIFs

My family is big. At one point I had around 12 uncles and aunts. With their SOs and children, dogs, hobbies and all the rest. One of my aunts decided to create a WhatsApp group and she added the whole bunch to it. It was a good idea. Family matters more than pretty much anything else. However, there's a catch. Given the limited topics that can be discussed in such a narrow group of people [1], the conversation tends to follow some predictable patterns:

  • They send pictures of their children.
  • Weather comments are encouraged.
  • Random meme forwarding, mainly covering local politics, is, ermh, tolerated.

And, well, I think I get it. But this group lacked my voice. What can I say? In Warsaw the weather is nice? Or, I solved an issue in a massive Entity Framework codebase? Nah, not really. It's a weird position to be in. I want to take part, but I don't believe I have any reasonable input for these people. And then… everything became obvious. Let's use the hive mind, let's… become the internets. And there's just one way of doing so: by sending cat GIFs.

And this is what I learnt: you can stop any conversation in any digital medium with a well-placed cat GIF. Simple. Effective. Evil. And this deserves some deeper analysis. By doing so, are we improving the world? [2] Well, perhaps; who am I to answer such a deep question. What I know for sure is that it works. When they know that any BS is going to be answered with a brief movie of a cat doing catty things, somehow they lose some of their motivation. And this makes everyone's lives a little bit better.

The dark ones

For the last two years I've been following the news from US campuses with interest.

You must understand that I spent my university years in a tech institute. 95% boys. No beer allowed on campus at all. It was just us and advanced calculus. During that time I grew completely detached from anything human [3]. Only the machine was relevant. What will it compute? What's the intent of this piece of code? What are we trying to achieve?

During those years, I never, and I mean it, I never thought about the lack of ladies. It was the situation. It was reality, nothing you could argue with. 15 years after that I was hired by DICE and a new period opened. There, the situation was different. The lady proportion was higher (but not by much).

And here is when everything started.

I was introduced to the world of the Politically Correct [5]. It was, and it is, a battle over language. Reality morphs. Timeless concepts are no longer valid. What you believed was reasonable is now dead, unreasonable and bigoted. And that… that was a shock. But… what the hell was happening?

  • Knowledge evolves, get with the times.

This is true. I guess. Dunno. We're just newcomers in this life race [4]. But the problem with language is that any change proposed to it is a blatant challenge to the individual's conception of the world. Through this lens, accepting new terms is not exactly a cheap matter.

On the other hand, as we advance and get a more nuanced vision of everyday issues, the appearance of new words is, in essence, unavoidable.

  • Agendas, everywhere.

And here comes the problem. When you're exposed to a new wording of a known phenomenon, what's happening? Well, in my view the problem arises with the intent of the person introducing this new concept. A reasonable question to ask is "what's the intent of this individual?". And here is where everything comes apart.

  1. The issue with feelings/awareness.
    And here is where my problem begins, I guess. But how much data do feelings actually contain? And, again, the answer is "not much", really. More on this in the future.

  2. Working with new facts.
    This is the real double whammy.
    When facing new realities there is no darkness, it's just [6] us and the new reality. It's painful but, at the end of the day, better for everyone. My question here is: how often can a new fact be uncovered? My intuition is that this happens only in particularly extreme cases, so in this light I tend to revert to case 1.

Smurfling songs

A couple of years ago, during Christmas, in a strange Saturday-morning mood, I decided to make the best kind of present for my close friends: a smurfled version of a particularly sexist song from the Spanish 60s:

  1. Mainly for civil reasons. I mean, everyone has that weird brother-in-law. And I'm no different from anyone else, that's for sure.
  2. I discovered a couple of years ago that the vast majority of rhetorical questions are supposed to be answered "no".
  3. Yes, it was like becoming a SEAL, but with an HP Unix mainframe.
  4. A reference to the Red Queen's race, btw.
  5. At that time I discovered the idea of "feelings" in discourse.
  6. Me.


This entry is an iteration on Python Multiprocessing. During the last week I found an opportunity in my code: process the output of a long transcode in a background worker. The catch here is that said transcode generates chunks of work; I can listen to its progress and process every chunk individually while the main transcode is still running.

The code can be found here: Queue processing, and you'll find something that resembles a test suite here: Driving the queue processing.

So these are my requirements:

  • I want the whole structure to be encapsulated behind a class; facading might be a good way of describing the approach. After the object creation, a new process is spawned and it peacefully waits for tasks to do.
  • Sending a task to the background worker is non-blocking for the caller, no matter if the worker has crashed.
  • When the main process has finished, I collect the results from the background. Those results must match the task objects.

So, in a way, I’m mapping a list of tasks that is being populated little by little. I’m sure there is a better way out there to implement this pattern.

Code walking

The basic usage looks like this:

w = queue_consumer.WorkingProcessFacade()
w.process_task(queue_consumer.SimpleTask("one", 1.0))
w.process_task(queue_consumer.SimpleTask("two", 2.5))
tasks_result = w.collect_tasks()

for t in tasks_result:
    print str(t.ID) + " - " + str(t.result)

The WorkingProcessFacade will spawn the background process on creation. What the task actually does is, obviously, an implementation detail; here I'm using the float as an estimate of the background processing time. Calling collect_tasks signals the end of the job: it terminates the worker, closes the queues, and changes the state of the facade so that no more tasks are accepted.

def process_queue(console_mutex, input_queue, output_queue):
    print_mutexed(console_mutex, "Started, waiting for tasks")

    task = input_queue.get()
    while task is not None:
        if task.ID.lower() == "crash": 
            raise Exception("freak out")
        print_mutexed(console_mutex, "Received: " + str(task.ID))
        time.sleep(task.estimated_processing_time) # simulates the task
        task.result = str(task.ID) + " solved after " + str(task.estimated_processing_time)
        output_queue.put(task)
        task = input_queue.get()

    output_queue.close()
    print_mutexed(console_mutex, "Termination received, ending worker process")

Since this code is mainly for training, I've taken some liberties. For instance, naming a SimpleTask "crash" will kill the background worker. Also, I'm sending an explicit terminator through the queue so the process gets notified of the end of the work. An alternative would be to use close on the communication queue; I might look into that in the upcoming days.
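For reference, this is roughly what a facade like that could look like. It is a sketch of the pattern, not the code from the linked repository, so any detail beyond what is shown above is an assumption:

import multiprocessing


class WorkingProcessFacade(object):
    def __init__(self):
        self._console_mutex = multiprocessing.Lock()
        self._input_queue = multiprocessing.Queue()
        self._output_queue = multiprocessing.Queue()
        self._accepting_tasks = True
        self._sent_count = 0
        # The background worker is spawned as soon as the facade is created.
        self._worker = multiprocessing.Process(
            target=process_queue,
            args=(self._console_mutex, self._input_queue, self._output_queue))
        self._worker.start()

    def process_task(self, task):
        # Non-blocking for the caller: the task is only enqueued here.
        if not self._accepting_tasks:
            raise RuntimeError("collect_tasks has already been called")
        self._input_queue.put(task)
        self._sent_count += 1

    def collect_tasks(self):
        # Send the explicit terminator, drain the results and join the worker.
        self._accepting_tasks = False
        self._input_queue.put(None)
        results = []
        for _ in range(self._sent_count):
            try:
                results.append(self._output_queue.get(timeout=5))
            except Exception:
                break  # the worker may have died; don't deadlock the caller
        self._sent_count = 0  # so repeated calls are harmless
        self._worker.join(timeout=5)
        return results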

When things go wrong

From time to time the background workers just die. Exceptions are thrown, resources are unavailable, or a million other things go terribly wrong. I've tested this solution against:

  • multiple calls to collect_tasks
  • the background worker dying untimely; the facade shouldn't deadlock

What is missing

Relying on the user code to call collect_tasks might be a little bit risky. Using a with pattern, or writing a custom destructor, might be useful. So, if you use this implementation, keep an eye open for the usage pattern.
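A sketch of that with idea, building on the facade outline above (again an assumption, not code from the linked repository):

class ManagedWorkingProcessFacade(WorkingProcessFacade):
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always shut the worker down, even if the caller forgot collect_tasks
        # or their code raised; never swallow the exception itself.
        if self._accepting_tasks:
            self.collect_tasks()
        return False


# Usage:
# with ManagedWorkingProcessFacade() as w:
#     w.process_task(queue_consumer.SimpleTask("one", 1.0))
#     tasks_result = w.collect_tasks()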

And also: Bonus points for mutexing the console.


The code I'm going to discuss a little can be found here: Multiprocessed PyPing. It probably violates a gazillion conventions, btw, so use with care.

Lately I've been playing with Raspberry Pis at home. They're fun little machines: you can host your local Git on them, they work as a simple file server, you can play movies on them… even program a little bit if you're brave enough.

On an impulse I bought the Raspberry Pi 3. Great machine. I hooked it to the TV and did the initial configuration, tested the gcc compiler, configured the WiFi, and then I placed it in a corner of the living room. But I didn't think about assigning it a fixed IP. There it was, an inconspicuous black box, completely inaccessible.

Given that lately I'm using Python for everything, I thought about automating the task of listing the devices on my network that answer to ping. I found this library, Pyping, and after unpacking it and extracting the code, I tried to list the network one address at a time. And it was slow, and boring, and a little bit miserable.

And here enters multiprocessing: what if we fork the heck out of this program? Let's distribute the job by assigning one IP to test to each process. So, first:

hostname = socket.gethostname()
local_ip = socket.gethostbyname(hostname)    # Yep, that's the pattern
if not is_valid_ip4_address(local_ip):
    print "ATM this thingie only works with IP4 addresses"

ip_parts = local_ip.split('.')
ips_to_test = []
for i in range(1,255):
    l = ip_parts[0:3]
    l.append(str(i))
    ips_to_test.append('.'.join(l))

ips_to_test will hold a lovely list of strings with all the IPs in your A.B.C.* subrange. From that point, calling a map is as trivial as it sounds.
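is_valid_ip4_address is not shown here; one possible version, sketched with just the standard library, could be:

import socket


def is_valid_ip4_address(address):
    # inet_aton also accepts shortened forms such as "127.1", hence the dot count check.
    try:
        socket.inet_aton(address)
    except socket.error:
        return False
    return address.count('.') == 3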

Just to make things a little more interesting, let's add some messages to the whole thing and, since the console is a shared resource, let's mutex the processes. What about something like this?

children_lock = None    # Mutex here
def initializer_function(lock):
    global children_lock
    children_lock = lock

def fast_ping_by_ip(hostname):
    global children_lock
    p = ping(hostname, timeout=250, count=2)
    if p is not None and p.ret_code == 0:    # the host answers
        children_lock.acquire()
        print (hostname + " found ")
        children_lock.release()
        return True
    return False

    ... 

    lock = Lock()
    process_pool = Pool(multiprocessing.cpu_count(), 
        initializer=initializer_function, initargs=(lock,))

The variable children_lock keeps a reference to the shared lock used for writing to the console. But the best part of all this is how the whole show is started:

does_ip_answers = process_pool.map(fast_ping_by_ip, ips_to_test)
for i in range(len(does_ip_answers)):
    if does_ip_answers[i]:
        print "Found: " + ips_to_test[i]

Look at that Pythonic beauty.

Good times.

PS: Yes, now I can find my new Raspberry in under 15 seconds, but I've forgotten the password. Dang.


During the last months at the office I've been completely focused on production and tools, which translates into quite a lot of Python and C# code. That is good and bad: Python can be a bit addictive, it's so fast to prototype and develop with that returning to compilation cycles sometimes feels like a drag. And this is particularly true if you're lucky enough to use PyCharm and can attach to remote processes.

But at some point a few weeks ago I felt a bit bored, and I wondered if there was some online course I could take to keep doing some C++ on the side. And there Coursera might help.
It's not my first time with online training: I took an introductory course on MongoDB a couple of years ago, and some others on parallel programming, but always on different online platforms.
I find Coursera good enough. They have a phone app that is convenient for reviewing the videos on your commute, but the experience is not that good and sometimes it's a bit clunky. They have some downtime from time to time (but hey, like everyone) and the forums are a bit, ermh… well, they're forums.

So, after a bit of browsing I decided to join a CS classic, C++ for C Programmers, conducted by Ira Pohl. My only objective was to train a bit, get other views, and maybe learn something in the process. The course in itself is interesting but, unless you're extremely bored (as I was), I wouldn't suggest taking it.

This was, however, the first time a programming course I took relied on peer review for the exercises. Every one or two weeks you are supposed to turn in an exercise, and other students will review and grade it. And man… that's where I have my reservations. Reading other people's code is always interesting, but making that your only grading mechanism is another thing.
Other courses have an automated testing platform in place and, even if as a student it can be a PITA, it seems fair in general. The MongoDB course, for instance, had one of those and it seemed quite reasonable.

At this point I think a combination of both, automated tests and code reviews, would be the best approach, assuming some honesty from the students, of course.


jc_bellido

My name is Carlos Bellido and I work coding games in Stockholm. I rediscovered swimming and gyms after moving to Sweden. Keep in mind that Kalles Kaviar is an acquired taste.


I work in the audio department at FatShark.