Bad data systems do not justify your sexist behavior

This week we got a letter from Atlantic Broadband, our ISP, addressed to “Aaron & Eliza Crosman Geor”. My wife has never gone by Eliza and her last name is not “Geor”.

Atlantic Broadband to: Aaron & Eliza Crosman Geor

It’s been this way since we signed up with them. When we ask them to fix it, they acknowledge that they cannot because their database cannot correctly handle couples with different last names who both want to appear on the account. Apparently it is the position of Atlantic Broadband that in 2016 it is reasonable to tell a woman she cannot be addressed by her legal name because it would be expensive for them to fix their database, and therefore she must be misaddressed or left out entirely.

I consider this unacceptable from old companies, but Atlantic BB was founded in 2004 – there are probably articles about not making assumptions about people’s names that are older than their company.

Folks, it is 2016. When companies insult people and then blame their databases, it is because they do not consider all their customers worthy of equal respect.

So let’s get a few basics out of the way:

  • Software reflects the biases of the people who write it and buy it.
  • If your database tells someone their name is invalid your database is not neutral. Just because you don’t get the push-back that Facebook sees when they mess this up does not mean what you’re doing is okay.
  • If your database assumes my household follows 1950s social norms, the company that uses it considers 1950s social norms acceptable in 2016 – and there are probably a few of those they don’t want to defend (I hope).
  • When an email, phone rep, or letter calls me by my wife’s last name or her by mine, in both cases they are assuming she has my last name not that I have hers. This is a sexist assumption that the company has chosen to allow.

Of course Atlantic isn’t the only company that does this: Verizon calls me Elizabeth in email a couple times a week because she must be primary on that account (one person must lead the family plan), and Nationwide Insurance had to hack their data fields for years so my wife could appear on our car insurance card (as required by law) every time we moved because their web interface no longer allowed the needed changes. The same bad design assumptions can be insulting for other reasons such as ethnic discrimination. My grandmother was mis-addressed by just about everyone until she died because in the 1960s the Social Security Administration could not handle having an apostrophe in her name, and no one was willing to fix it in the 50 years that followed SSA’s uninvited edit to her (and many other people’s) name.

In all these cases the representatives say something to the effect of “our computers cannot handle it.” And that of course is simply not true. Your systems may not be set up to handle real people, but that’s because you don’t believe they should be.

Let’s check Atlantic Broadband’s beliefs about their customers based on how they address us (I’m sure there are some additional assumptions not reflected here but these are the ones they managed to encode in one line in this letter):

  • They assume they are addressing one primary account holder: I happen to know from my interactions with them that they list my first name as: “Aaron & Eliza”, and my last name as “Crosman Geor”. Plenty of households have more than one, or even two, adults who expect equal treatment in their home. Our bank and mortgage company know we are both responsible adults; why is this so hard for an ISP (or insurance company, cell provider, credit card company, etc.)?
  • They assume my first name isn’t very long: They allowed 13 characters, but 4 more is too many. I went to high school with a kid who broke their database by exceeding the 26 character limit it had (they didn’t ask the kid to change his name, the school database admin fixed the database), but Atlantic can barely handle half that.
  • They assume my last name isn’t very long: Only 12 characters were used and they stopped in a strange place. I know many people with last names longer than that: frequently people who have hyphenated last names blow past 12. Also the kid with a 26 character first name – his surname was longer.
  • They assume my middle name isn’t an important part of my name: If they had a middle name field, they could squeeze a few more letters in and make this read more sensibly.  But they only consider first and last names important. Plenty of people have three names – or more – they like to have included on letters.
  • They assume it is okay to mis-address me and my wife: The name listed is just plain wrong, but they believe it’s okay to keep using this greeting. They assume this even after they have been told it’s not, and even after we’ve reduced service with them (if another ISP provided service to my house I’d probably cut it entirely although mostly for other reasons). They believe misaddressed advertisements will convince me I need a landline or cable package again.

Now I’ll be fair for just a minute and note something they got right: they allow & and spaces in a name so Little Bobby Tables might be able to be a customer without causing a crisis (partially because his name is too long for them to fit a valid SQL command into the field).

Frequently you’ll hear customers blame themselves because their names are too long or they have done something outside the “norm”. Let’s be clear: this is the fault of the people who write and buy the software. Software development is entirely too dominated by men, as is the leadership of large companies. When a company lacks diversity in key roles you see that reflected in the systems built to support the work. Atlantic’s leadership’s priorities and views are reflected in how their customers are addressed because they did not demand the developers correct their sexist assumptions.

These problems are too common for us to be able to refuse to do business whenever they come up. I will say that when we switched our insurance to State Farm they did not have any trouble understanding that we had different last names and their systems accommodated that by default.

If you do business with a company that makes these (or similar) mistakes, I think it’s totally reasonable to remind them every time you reasonably can that it’s offensive. Explain that the company is denying you, your loved ones, and/or your friends a major marker of their identity. Remind them they are not neutral.

If you write data systems for a living: check the assumptions you’re building into your code. Don’t blame the technology because you used the wrong character set or trimmed the field too short: disk is cheap, UTF-8 has been standard for 15+ years, and processors are fast. If the database or report layout doesn’t work because someone’s name is too long the flaw is not the name.
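As a rough illustration of the alternative, here is a minimal sketch that stores each account holder's name separately, exactly as entered, and builds the address line from that list. The function name, the array shape, and the sample names are all hypothetical; this is not how any of the companies above actually store their data.

```php
<?php
// Hypothetical sketch: every account holder is kept as their own entry, with
// the full name as the customer typed it (no first/last split, no trimming
// to a fixed number of characters).

/**
 * Build the addressee line for a letter from a list of account holders.
 *
 * @param string[] $holders Full names, in the order the customers chose.
 */
function format_addressees(array $holders): string {
  // Drop blank entries, but never shorten or rewrite a name.
  $names = array_values(array_filter(array_map('trim', $holders), 'strlen'));

  if (count($names) <= 2) {
    // Zero names gives an empty string; one or two are joined with " & ".
    return implode(' & ', $names);
  }

  // Three or more adults: "A, B & C".
  $last = array_pop($names);
  return implode(', ', $names) . ' & ' . $last;
}

// Sample names are invented for illustration.
echo format_addressees(['Alex Rivera', "Siobhán O'Connor-Nakamura"]), "\n";
```

Nothing here cares how long a name is, whether two adults share a last name, or whether a name contains an apostrophe or an accent.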

We all make mistakes and bad assumptions sometimes, but that does not make it okay to deny people basic respect. When we make a bad assumption, that’s a bug, and good developers are obligated to fix it. Good companies are obligated to prevent it from happening in the first place.

Try doing it backwards

As part of my effort not to repeat mistakes I have tried to build a habit in my professional – and personal – life to look for ways to be better at what I do. I recently rediscovered how much you can learn when you try doing something you know well backwards: I drove on the left side of the road.

This is the Holden Barina we rented while in New Zealand, a brand of car I’d never heard of before this trip. It was a good car for the mountain driving even if the wipers and lights controls were reversed from cars at home.

By driving on the left I discovered how many basic driving habits I have that are built around driving on the right. The clearest being that the whole time I was in New Zealand I never knew if anyone was behind me, and the whole time I couldn’t figure out why. The mirrors on the car worked just fine, but it turned out I wasn’t looking at them. Driving home from the airport after we returned to the US I realized that every few seconds my eyes jump to the upper right of my vision to check the mirror. In New Zealand I spent the whole time glancing at the post between the windshield and the driver’s side window (which had seemed massive to me while I was there) instead of the mirror. It made me conscious of my driving habits in a way I haven’t been in years, and as a consequence, I think it’s made me a better driver. I’m thinking about little details again; I’ve been more aware of where I am on the road and what I’m doing to keep track of the other cars around me.

My wife drove this section so I got to take some pictures. Amazing scenery but she had to adjust quickly.
My wife drove this section while I got to take some pictures. She got to learn to drive on the left on winding mountain roads – we don’t recommend that approach.

A few years ago I was watching videos from the MIT Algorithms course to refresh some of my basics, and because I wanted to know what had been added in the decade since I’d taken that class at Hamilton. During the review of QuickSort the professor mentioned that it wasn’t originally a divide-and-conquer process, but a loop-based approach meant to work on a fixed-length array (so you could use a fixed block of memory). As I recall he suggested that the students work out the loop-based version. So riding the train home from work I pieced it together, and found that it’s an elegant process. It’s not something I ever expect to have cause to implement, but it did improve my thinking about when to use recursion, when to use a loop, and when to reach for other tools for processing everything in a list. There was a session by John Kary at DrupalCon this year on rethinking loops that pushed me again to revise some of how I make those decisions. Again his talk took the reverse view of much of my previous thinking and was therefore very much worth my time.
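For the curious, a loop-based quicksort can look something like the sketch below. It is not the version from the lecture: it uses Lomuto partitioning and an explicit stack of index ranges in place of recursive calls, and the function name is made up for the example.

```php
<?php
/**
 * Sort a list in place with quicksort, driven by a loop and an explicit
 * stack of index ranges instead of recursion.
 */
function quicksort_iterative(array &$items): void {
  $items = array_values($items); // make sure indexes run 0..n-1
  $stack = [[0, count($items) - 1]];

  while ($stack) {
    [$low, $high] = array_pop($stack);
    if ($low >= $high) {
      continue; // ranges of zero or one element are already sorted
    }

    // Lomuto partition: move everything smaller than the pivot to its left.
    $pivot = $items[$high];
    $boundary = $low;
    for ($i = $low; $i < $high; $i++) {
      if ($items[$i] < $pivot) {
        [$items[$i], $items[$boundary]] = [$items[$boundary], $items[$i]];
        $boundary++;
      }
    }
    [$items[$high], $items[$boundary]] = [$items[$boundary], $items[$high]];

    // The pivot now sits in its final position; queue the two sides.
    $stack[] = [$low, $boundary - 1];
    $stack[] = [$boundary + 1, $high];
  }
}

$numbers = [5, 3, 8, 1, 9, 2];
quicksort_iterative($numbers);
print_r($numbers);
```

The stack here grows as needed, so this sketch does not try to reproduce the fixed-memory property mentioned above; it only shows that the recursion can be replaced by a loop.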

If you’re feeling like you are in a good groove on something, try doing it backwards and see what you discover.

My Grandmother’s Hats

Near the end of her life my grandmother made hats – lots of hats. Most of them made from cheap acrylic yarns and most sized for children. She lived alone and spent much of her time sitting in her apartment watching C-SPAN and knitting. At the time her apartment was a few blocks from my office so I went to visit her about once a week and we’d chat about current events, politics, and whatever else was on her mind. If you went to visit her during the winter and you were not wearing a hat when you arrived (and since she was on the 19th floor I usually took my hat off in the elevator) you were strongly encouraged to take a few to keep warm when you left.

Blue and white handmade hat.
This is one of the hats I still have, made from yarn my mother had used several years earlier for her own project.

My mother and aunt would bring her yarn from various sales and the ends of projects, and I would occasionally take a bag or two of hats with me after a visit. I tried giving them away on the street to homeless people who often slept near her apartment or my office but they weren’t usually interested. I sent a couple bags to Afghans for Afghans (before I realized they needed items made from better fiber). We sent some to a friend who taught in Buffalo (Nanny loved the picture of children wearing her hats sent in the thank you note).

Even having given away a couple bags full, when it came time for her to stop living alone we discovered there was a closet full of grocery bags stuffed with hats. We laughed at the large number of hats she had stashed away but as her last charitable act – even though it was one she never knew about – we gave them away. My wife and I gave them to school teachers who had children in need. My sister took a few hundred to a women’s shelter. And we donated them to other useful causes when we could find people able to use them. Slowly my grandmother’s work was spread across several communities.

Fibre on the shelf
Fibre in our house that isn’t used promptly may become a decoration.

In addition to working in technology, I also spin – as in make yarn from wool and other fibers. I get more or less done depending on the ebbs and flows of life, but I generally have some fiber in the process of becoming yarn on a spinning wheel. My fibre stash is small compared to some spinners’ but since lots of my fiber is either a gift or a random purchase from events like the Maryland Sheep and Wool Festival instead of a specific project, I usually have extra yarn around too.

spinning wheel
My Ashford traditional, a gift from my mother-in-law. This is my primary spinning wheel, although I have a few others.
Wool yarn I made recently from fibre my mother-in-law gave me a couple years ago.

I also knit. And yes, I knit hats. Visiting in her apartment talking about the hats she was making, Nanny shared – more than once – the pattern she used to make all those hats. She’d memorized it from some magazine or another years earlier (or at least a pattern like it) and so I typically make them about the same as she did:

  1. Pick a size needle that works with the yarn you have, and will result in the size you want (this takes trial and error if you aren’t an experienced knitter who knows their gauge).
  2. Cast on 96 stitches (she sometimes said 76, but usually it was 96 and that appeals to my techie nature).
  3. Work 2 inches in Knit 2, Purl 2.
  4. Work 3 inches in Stockinette.
  5. Knit 2 together, Knit 6, repeat to the end of the row. Purl back.
  6. Knit 2 together, Knit 5 repeat to the end of the row. Purl back.
  7. Continue this way, decreasing the number of stitches between each gather, until your gathers start to collide. Then knit 2 together every stitch (still purl back).
  8. When you have 4 or 5 stitches on the needles, cut the yarn with about a foot of extra.
  9. Pull the end back through the remaining loops, and stitch down the open side.
This is one of mine made from some yarn that was started during a workshop I did on spinning with a drop spindle for high school kids.
This is one of mine made from some wool left over from a workshop I did on spinning with a drop spindle and some singles I used to teach a friend’s kids a little about spinning. Once plied together the variation in the singles formed a surprisingly good yarn with interesting color mixes.

I’ve made hats for friends and family, although certainly not in the volume she produced them. Last winter I realized I could use some of my stash of fiber to start making them to give away like we did with hers. And so now I have a slowly growing collection of hats (some with matching scarves) in the hopes that I am able to do something half as useful with them as we managed with my grandmother’s.

Here’s one of the matching scarves. These are sized for a child, but I’ll probably make more sizes over time.

My grandmother was a challenging person in many ways. But she always tried to be nice to strangers and people in need. So this has become my tribute to the parts of her that I loved most.

Looking at a project from different angles

For our 15th anniversary my wife and I went to the south island of New Zealand, with a long layover in Sydney. We only had a few hours in Sydney so we went to see the Opera House and then walk through the botanical gardens next door.

As we walked around the harbor I took pictures of the opera house from several different angles. And that got me thinking about the advice I’ve been given both about photography and about my work: make sure you try things from different angles.

A classic angle of the Sydney opera house from across the harbor.

Too often all kinds of experts get into a rut and lose track of the perspective of non-experts, and of other experts with whom they disagree. Cable news channels like to package those ruts as two talking heads yelling at each other and call it “debate”.

It’s an easy trap to fall into even without watching the people paid to yell at each other. Sometimes when we look at a problem twice it looks different because we changed something small, and we think we’ve seen all the valid angles. But we’ve just reinforced our sense of superiority, not actually explored anything interesting yet.

When you look right at the sun a small change can have a large impact, but you may still be fundamentally in the same place with a fundamentally flawed perspective.

And sometimes you look from a new angle and something easily recognizable becomes new and different, but that’s not always an improvement. There are reasons for best practices, and sometimes we just reinvent the wheel when we try to break our own path.

You don’t see pictures of the opera house from this angle often – which is probably for the best.
This angle was even worse. It’s a good thing I wasn’t using film for this exercise.

And sometimes it is important to think about the extra details that you can capture by changing perspectives and taking the time to figure out the best approach.

Opera House with sailboat
I had to wait a few minutes for the sailboat to get into a spot that made it look right.
Sometimes too much context is too distracting and makes it hard to know what you’re supposed to look at.

But when you take the time to look at things from different angles, perspectives, and positions sometimes you get to discover something you didn’t know to ask about.

This little guy and an older buddy spend lots of time in the sun on these steps behind the opera house – they are well known locally, but I had no idea they were there until we were walking around.

For me the best moments are those gems you find when you take the time to explore ideas and viewpoints and discover something totally new. Nothing beats travel to help you remember to change your perspective now and again.

Picking tools you’ll love: don’t make yourself hate it on day one.

Every few years organizations replace a major system or two: the web site, CMS, CRM, financial databases, grant software, HR system, etc. And too often organizations try to make the new tool behave just like the old tool, and as a result hate the new tool until they realize that they misconfigured it and then spend 5-10 years dealing with problems that could have been avoided. If you’re going to spend a lot of money overhauling a mission critical tool you should love it from day one.

No one can promise you success, but I promise if you take a brand new tool and try to force it to be just like the tool you are replacing you are going to be disappointed (at best). Salesforce is not CiviCRM, Drupal is not WordPress, Salsa is not Blackbaud. Remember you are replacing the tool for a reason: if everything about your current tool was perfect you wouldn’t be replacing it in the first place. So here are my steps for improving your chances of success:

  1. List the main functions the tool needs to accomplish: This is the most obvious thing to do, but make sure your list only covers the things you need to do, not the ways you currently do it. Try to keep yourself at a relatively high level to avoid describing what you have now as the required system.
  2. List the pros and cons of what you have: Every tool I’ve ever used had pluses and minuses. And most major internal systems have stakeholders who love and hate it – sometimes that’s the same person – make sure you capture both the good and bad to help you with your selection later.
  3. Develop a list of tools that are well known in the field: Not just tools you know at the start of the project. Make sure you hunt for a few that are new to you. You might think you’ve heard of them all because you walked around the vendor hall at NTC last year, but I promise you there are more companies that picked a different conference to push their wares, and there are open source tools you might have missed too.
  4. Make sure every tool has a salesperson: Open source tools can be overlooked because no one sells them to you, and that may mean you miss the perfect tool for your organization. So for open source, even the playing field by having a salesperson, or champion, for the tool. This can be an internal person who likes learning new things, or an outside expert (usually paid but sometimes volunteer).
  5. Let the sales teams sell, but don’t trust them: Let sales people run through their presentations, because you will learn something along the way. But at some point you also need to ask them questions that force them off their script. Force a demo of a non-contrived example, or of a feature they don’t show you the first time. Make them improvise and see what happens.
  6. Talk to other users, and make sure you find one who is not happy: Sure your organization is unique but lots of other organizations have similar needs for the basic tools – unless you have a software-based mission you probably do not want an email system that’s totally different from everyone else’s. A good salesperson will have no trouble giving you a list of references of organizations who love the tool, but if you want the complete picture find someone who hates it. They might hate it for totally unfair reasons, but they will shed light on the rough edges you may encounter. Also make sure you ask the people who love it what problems they run into. Remember nothing is perfect, so everyone should have a complaint of some kind.
  7. Develop a change strategy: In addition to a data migration plan you need to have a plan that covers introducing the new tool to your colleagues, training the users, communicating to leadership the risks and rewards of the new setup, and setting expectations about any disruptions the changeover may cause. I’ve seen an organization spend nearly a half million dollars on customization of a complex toolset only to have the launch fail because they didn’t make sure the staff understood that the new tool would change their day-to-day tasks.
  8. Develop a migration plan: Plan out the migration of all data, features, and functions as soon as you have your new tool selected. This is not the same thing as your change strategy; this is the nuts and bolts of how things will work. Do not attempt to do this without an expert. You made yourself an expert in the field, but not of every in-and-out of the new system: hire someone who is. That could be a setup team from the company that makes it, a 3rd party consultant, or a new internal staff person who has experience with different instances of the tool.
  9. Get staff trained on using the new tool: Don’t scrimp on staff training. Make sure they have a chance to learn how to do the things they will actually be doing on a day-to-day basis. If you can afford to have customized training arranged I highly recommend it; if you cannot have an outside person do it, consider custom building a training for your low-level internal users yourself.
  10. Develop a plan for ongoing improvement: You will not be 100% happy 100% of the time, and over time those problems will get worse as your needs shift. So make sure you are planning to consistently improve your setup. That can take many forms and what makes the most sense will vary from tool to tool and org to org, but it probably will mean a budget, so ask for money from the start and build it into your ongoing budget for the project. Plan for constant improvement or you will find a growing list of pain points that push you to redo all this work sooner than expected.

You’ll notice I never actually told you to make your choice. Once you’ve completed steps 1-6 you probably will see an obvious choice; if not: guess. You have a list, you listened to 20 boring sales presentations, you’ve read blog posts, white papers, and ad materials. You now are an expert on the market and the tools. If you can’t make a good pick for your organization, no one else can either, so push aside your imposter syndrome and go with your gut. Sure you could be wrong, but do the best you can and move forward. It’s usually better to make a choice than waffle indefinitely.

Sins Against Drupal 2

This is part of my ongoing series about ways Drupal can be badly misused. These examples are from times someone tried to solve an otherwise interesting problem in just about the worst possible way.

I present these at SC Drupal Users Group meetings from time to time as an entertaining way to discuss interesting problems and ways we can all improve.

This one was presented about a year ago now (August 2015). Since I wasn’t working with Drupal 8 when I did this presentation the solution here is Drupal 7 (if someone asks I could rewrite for Drupal 8).


The Problem

The developer needed to support existing Flash training games used internally by the client. Drupal was used to provide the user accounts, game state data, and exports for reporting. The games were therefore able to authenticate with Drupal and save data to custom tables in the main Drupal database. The client was looking for some extensions to support new variations of the games and while reviewing the existing setup I noticed major flaws.

 

The Sinful Solution

Create a series of bootstrap scripts to handle all the interactions, turning Drupal into a glorified database layer (also while you’re at it, bypass all SQL injection attack protections to make sure Drupal provides as little value as possible).

The Code

There was a day when bootstrap scripts were a really cool way to do basic tasks with Drupal. If you’ve never seen or written one: basically you load bootstrap.inc, call drupal_bootstrap() and then write code that takes advantage of basic Drupal functions – in a world without drush this was really useful for a variety of basic tasks. This was outmoded (a long time ago) by drush, migrate, feeds, and a dozen other tools. But in this case I found the developer had created a series of scripts, two for each game, that were really similar, and really, really dangerous. The first (an anonymized version is shown below) handled user authentication and initial game state data, and the second allowed the game to save state data back to the database.

As always the script here was modified to protect the guilty, and I should note that this is no longer the production code (but it was):

<?php 
require_once './includes/bootstrap.inc'; 
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL); // "boot" Drupal 
define("KEY", "ed8f5b8efd2a90c37e0b8aac33897cc5"); // set key 

// check data 
if(!(isset($_POST['hash'])) || !(isset($_POST['username'])) || !(isset($_POST['password']))) { 
  header('HTTP/1.1 404'); 
  echo "status=100"; 
  exit; // missing a value, force quit 
} 

// capture data 
$hash = $_POST['hash']; 
$username = $_POST['username']; 
$password = $_POST['password']; 

// check hash validity 
$generate_hash = md5(KEY.$username.$password); 
if($generate_hash != $hash) {
  header('HTTP/1.1 404'); 
  echo "status=101"; 
  exit; // hash is wrong, force quit 
} 

// look for username + password combo 
$flashuid = 0; 
$query = db_query("SELECT * FROM {users} WHERE name = '$username' AND pass = '$password'"); 
if ($obj = db_fetch_object($query)){ 
  $flashuid = $obj->uid;
}

if($flashuid == 0) {
  header('HTTP/1.1 404');
  echo "status=102";
  exit; // no match found
}

// get user game information
$gamequery = db_query("SELECT * FROM {table_with_data_for_flash_objects} WHERE uid = '$flashuid' ORDER BY lastupdate DESC LIMIT 1");

if ($game = db_fetch_object($gamequery)){
  $time = $game->time;
  $round = $game->round;
  $winnings = $game->winnings;
  $upgrades = $game->upgrades;
} else {

  // no entry, create one in db
  $time = $round = $game_winnings = $long_term_savings = $bonus_list = "0";
  $upgrades = "";
  $insert = db_query("INSERT INTO {table_with_data_for_flash_objects} (uid, lastupdate) VALUES ('$flashuid',NOW())");
}

$points = userpoints_get_current_points($flashuid);

// echo success and values
header('HTTP/1.1 201');
echo "user_id=$flashuid&points=$points&ime=$time&round=$round&winnings=$winnings&upgrades=$upgrades";

?>

Why is this so bad?

It’s almost hard to know where to begin on this one, so we’ll start at the beginning.

  • Bootstrap scripts are no longer needed and should never have been used for anything other than a data import or some other ONE TIME task.
  • That key defined in line 3 is used to track sessions (see lines 20-21). If you find yourself having to recreate a session handler with a fixed value, you should assume you’re doing something wrong. This is a solved problem; if you are re-solving it you had better be sure you know more than everyone else first.
  • Error handling is done inline with a series of random error status codes that are printed on a 404 response (and the Flash apps ignored all errors). If you are going to provide an error response you should log it for debugging the system, and you should use existing standards whenever possible. In this case 403 Forbidden (or 401 Unauthorized) is a far better response when someone fails to authenticate.
  • Lines 15-17, and then line 30: a classic Bobby Tables SQL injection vulnerability. Say goodbye to security from here on in. They go on to repeat this mistake several more times.
  • Finally, just to add insult to injury, the developer spends a huge amount of time copying variables around to change their name: $password = $_POST['password']; $round = $game->round; There is nothing wrong with just using fields on a standard object, and while there is something wrong with just using a value from $_POST, copying it to a new variable does not make it trustworthy.
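To make the injection problem concrete, here is a small stand-alone sketch (plain PHP, no Drupal required) that builds the query text the same way the script above does. The helper function and the attacker input are invented for illustration, and the table name is written without Drupal's {braces} for simplicity.

```php
<?php
// Builds the SQL text the way the sinful script does: raw request values
// are dropped straight into the string.
function build_login_sql(string $username, string $password): string {
  return "SELECT * FROM users WHERE name = '$username' AND pass = '$password'";
}

// A hypothetical attacker-supplied username. The "-- " starts an SQL
// comment, so the password check after it is never evaluated.
$sql = build_login_sql("admin' -- ", 'anything');
echo $sql, "\n";
// SELECT * FROM users WHERE name = 'admin' -- ' AND pass = 'anything'
```

In Drupal 7 the same lookup can use a placeholder, for example db_query('SELECT uid FROM {users} WHERE name = :name', array(':name' => $username)), which keeps the value out of the SQL text entirely. For checking a password, user_authenticate() is the safer route, since Drupal 7 stores password hashes and not the raw value.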

Better Solutions

There are several including:

  • Use a custom menu to define paths, and have the application just go there instead.
  • Use Services module: https://www.drupal.org/project/services
  • Hire a call center to ask all your users for their data…

If I were starting something like this from scratch in D7 I would start with Services, and in D8 I’d start with the built-in support for RESTful web services. Given the actual details of the situation (a pre-existing Flash application that you have limited ability to change) I would go with the custom router so you can work around some of the bad design of the application.

In our module’s .module file we start by defining two new menu callbacks:

// Implements hook_menu(); this assumes the module is named game_module.
function game_module_menu() {
  $items = array();

  // Endpoint the Flash application calls to authenticate a user.
  $items['games/auth'] = array(
    'title' => 'Games Authorization',
    'page callback' => 'game_module_auth_user',
    'access arguments' => array('access content'),
    'type' => MENU_CALLBACK,
  );
  // Endpoint that captures game data for a logged-in player.
  $items['games/game_name/data'] = array( // yes, you could make that a variable instead of hard code
    'title' => 'Game Data',
    'page callback' => 'game_module_game_name_capture_data', // and if you did you could use one function and pass that variable
    'access arguments' => array('player'),
    'type' => MENU_CALLBACK,
  );

  return $items;
}

The first allows for remote authentication, and the second is an endpoint to capture data. In this case the name of the game is hard coded, but as noted in the comments in the code you could make that a variable.

In the original example the data was stored in a custom table for each game, but never accessed in Drupal itself. The table was not set up with a hook_install(), nor did they need the data normalized since it’s all just pass-through. In my solution I switch to using hook_install() to add a schema that stores all the data as a blob. There are tradeoffs here, but this is a clean, simple solution:

...
'fields' => array(
'recordID' => array(
'description' => 'The primary identifier for a record.',
'type' => 'serial',

...
'uid' => array(
'description' => 'The user ID.',
'type' => 'int',

...
'game' => array(
'description' => 'The game name',
'type' => 'text',

...
'data' => array(
'description' =>'Serialized data from Game application',
'type' => 'blob',

...

You could also take this one step further and make each game an entity and customize the fields, but that’s a great deal more work that the client would not have supported.
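For the blob-based table, a complete schema definition might look something like the sketch below. The field names, descriptions, and types come from the excerpt above and the table name from the insert query; everything else (the 'unsigned', 'not null', 'default', and 'size' settings, the table description, and the primary key) is an assumption added for illustration. In Drupal 7, putting hook_schema() in the module's .install file is enough to create the table on install; Drupal 6 also needs hook_install() to call drupal_install_schema().

```php
<?php
/**
 * Implements hook_schema(). Lives in the module's .install file.
 */
function game_module_schema() {
  $schema = array();
  $schema['game_data'] = array(
    'description' => 'Pass-through storage for data posted by the games.', // assumption
    'fields' => array(
      'recordID' => array(
        'description' => 'The primary identifier for a record.',
        'type' => 'serial',
        'unsigned' => TRUE, // assumption
        'not null' => TRUE, // assumption
      ),
      'uid' => array(
        'description' => 'The user ID.',
        'type' => 'int',
        'unsigned' => TRUE, // assumption
        'not null' => TRUE, // assumption
        'default' => 0,     // assumption
      ),
      'game' => array(
        'description' => 'The game name',
        'type' => 'text',
        'not null' => TRUE, // assumption
      ),
      'data' => array(
        'description' => 'Serialized data from Game application',
        'type' => 'blob',
        'size' => 'big',    // assumption: records may be large
        'not null' => TRUE, // assumption
      ),
    ),
    'primary key' => array('recordID'), // assumption
  );
  return $schema;
}
```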

The final step is to define the callbacks used by the menu items in hook_menu():

function game_module_auth_user($user_name = '', $pass = '') { // Here I am using GET, but I don’t have to

  global $user;
  if ($user->uid != 0) { // They are logged in already, so reject them
    drupal_access_denied();
    return; // drupal_access_denied() does not end the request, so stop here
  }

  // In Drupal 7 user_authenticate() returns the uid on success and FALSE on failure.
  $uid = user_authenticate($user_name, $pass);

  //Generate a response based on result....
}

function game_module_game_name_capture_data() { // matches the 'page callback' in hook_menu()
  global $user;
  if ($user->uid == 0) { // They aren’t logged in, so they can’t save data
    drupal_access_denied();
    return; // without this the insert below would still run
  }

  $record = drupal_get_query_parameters($_POST); // ← we can work with POST just as well as GET if we ask Drupal to look in the right place.

  db_insert('game_data')
    ->fields(array(
      'uid' => $user->uid,
      'game' => 'game_name',
      'data' => serialize($record),
    ))
    ->execute();
  // Provide useful response.
}

For game_module_auth_user() I use a GET request (mostly because I wanted to show I could use either). We get the username and password, have Drupal authenticate them, and move on; I let Drupal handle the complexity.

The capture data callback does pull directly from the $_POST array, but since I don’t care about the content and I’m using a parameterized query I can safely just pass the information through. drupal_get_query_parameters() is a useful function that often gets ignored in favor of more complex solutions.
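To make the pass-through storage concrete, here is a small sketch of the round trip: the posted array goes into the 'data' column as a serialized string and comes back out as a plain array. Both helper names and the posted field names are made up for this example. The allowed_classes option needs PHP 7 or later, which is newer than what most sites of this era ran.

```php
<?php
/**
 * Hypothetical helper: turn a posted record into the string stored in the 'data' column.
 */
function game_module_pack_record(array $record) {
  return serialize($record);
}

/**
 * Hypothetical helper: read a stored record back as a plain array.
 */
function game_module_unpack_record($blob) {
  // allowed_classes => FALSE (PHP 7+) refuses to instantiate objects from stored data.
  $record = unserialize($blob, array('allowed_classes' => FALSE));
  return is_array($record) ? $record : array();
}

// What the game application might post (made-up field names).
$posted = array('score' => '1200', 'level' => '3');
$blob = game_module_pack_record($posted);
$restored = game_module_unpack_record($blob);
```

The tradeoff is the one already mentioned: the data is easy to store and hand back, but you cannot query inside it with SQL.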

So What Happened?

The client had a limited budget and this was a Drupal 6 site, so we did the fastest work we could. I rewrote the existing code to avoid the SQL injection attacks, moved them to SSL, and did a little other tightening, but the bootstrap scripts remained in place. We then went our separate ways since we did not want to be responsible for supporting such a scary setup, and they didn’t want to fund an upgrade. My understanding is they heard similar feedback from other vendors and eventually began the process of upgrading. You can’t win them all, even when you’re right.

Share your sins

I’m always looking for new material to include in this series. If you would like to submit a problem with a terrible solution, please remove any personally identifying information about the developer or where the code is running (the goal is not to embarrass individuals), post them as a gist (or a similar public code sharing tool), and leave me a comment here about the problem with a link to the code. I’ll do my best to come up with a reasonable solution and share it with SC DUG and then here. I’m presenting next month, so if you have something you want me to look at you should share it soon.

If there are security issues in the code you want to share, please report those to the site owner before you tell anyone else so they can fix it. And please make sure no one could get from the code back to the site in case they ignore your advice.

Sins Against Drupal 1

This is the first in an ongoing series about ways Drupal can be badly misused. These are generally times someone tried to solve an otherwise interesting problem in just about the worst possible way. All of these will start with a description of the problem, how not to solve it, and then ideas about how to solve it well.

I present these at SC Drupal Users Group meetings from time to time as an entertaining way to discuss ways we can all improve our skills.

This first one was presented a while ago now (Feb of 2015).


The Problem

The developer needed to support an existing JavaScript app with access to content in the form of Drupal nodes encoded in JSON. This is a key part of any headless Drupal project (this entire site was not headless, just one application), and in Drupal 7 and earlier there was no way to do this in core.

The Sinful Solution

Create a custom response handler within template.php that executes every time template.php is loaded.

The Code

During a routine code review of this site I found the following code in template.php:

...
function theme_name_preprocess_region(&$vars) {
  if ($vars['region'] == 'header') {
    $vars['classes_array'][] = 'clearfix';
  }
}

if (isset($_POST['mode'])) {
    if ($_POST['mode'] == 'get_fields_node') {
        $node = node_load($_POST['id']);

        $container = array();

        $sliderCounter = count($node->field_event_slider['und']);
        for ($i = 0; $i < $sliderCounter; $i++) {
            $field_collection_id = $node->field_event_slider['und'][$i]['value'];
            $field_collection = entity_load('field_collection_item', array($node->field_event_slider['und'][$i]['value']));

            $currentCollectionItem = $field_collection[$field_collection_id];

            if (isset($currentCollectionItem->field_slider_image['und'][0]['uri'])) {
                $container['slider'][$i]['src'] = file_create_url($currentCollectionItem->field_slider_image['und'][0]['uri']);
            }
            if (isset($currentCollectionItem->field_slider_caption['und'][0]['value'])) {
                $container['slider'][$i]['caption'] = $currentCollectionItem->field_slider_caption['und'][0]['value'];
            }
        }

        $focusTid = $node->field_event_type['und'][0]['tid'];
        $eventTerm = taxonomy_term_load($focusTid);
        $dateTid = $node->field_event_date['und'][0]['tid'];
        $dateTid = taxonomy_term_load($dateTid);

        $container['nid'] = $node->nid;
        $container['focusTid'] = $focusTid;
        $container['title'] = $node->title;
        $container['focus'] = $eventTerm->name;
        $container['date'] = $dateTid->name;
        $container['body'] = $node->body['und'][0]['value'];

        print json_encode($container);
        die();
    }

    if ($_POST['mode'] == 'get_fields_focus') {
        $focusTerm = taxonomy_term_load(substr($_POST['id'], 4));

        $container['nid'] = $_POST['id'];
        $container['title'] = $focusTerm->name;
        $container['body'] = $focusTerm->description;

        print json_encode($container);
        die();
    }
}
...

Notice that between the two functions is the random block of code wrapped in
if (isset($_POST['mode'])) {...}. So every request that loads the theme’s template.php triggers, in addition to the normal parsing, a check to see if the page request was a POST that included a mode. If it was, we then proceed to load up a node and a couple of taxonomy terms, and then encode them as JSON for the response. The site sends the response and then unceremoniously die()s.

Why is this so bad?

First, there is no parameter checking on the ID parameter: node_load($_POST['id']). Anyone on the internet can load any node if they work out the ID. Since nodes are sequentially numbered, you could just start at 1 and work your way up until you noticed that the site was sending the same useless response over and over. It also doesn’t send a 404 if the NID provided is invalid.

Second, no reasonable developer would expect you to hide a custom callback handler in template.php. It should be totally safe to load template.php without generating output under any condition (that should be true of all non-tpl.php files).

Third, this code could run during any page request, not just the ones the application designer thought about. The request could have had Drupal do something relatively expensive before reaching this stage, and all that work was just wasted server resources – which creates an additional avenue for an attacker.

Fourth, Drupal has an exit function, drupal_exit(), that actually does useful clean up and allows other modules to do the same. All that gets bypassed when you just die() midstream.

Finally, Drupal has tools to do all this. There was no reason to do this so badly.
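As a small sketch of the parameter check missing from the first point, PHP's filter_var() can reject anything that is not a positive integer before node_load() ever sees it. The helper name example_clean_nid() is made up for this illustration. A valid ID is only half the fix: the code would still need an access check, such as node_access('view', $node), before returning anything.

```php
<?php
/**
 * Hypothetical helper: return a positive integer node ID, or FALSE if the input is not one.
 */
function example_clean_nid($raw) {
  return filter_var($raw, FILTER_VALIDATE_INT, array(
    'options' => array('min_range' => 1), // node IDs start at 1
  ));
}

$good = example_clean_nid('12');    // integer 12
$bad = example_clean_nid('12abc');  // FALSE
```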

Better Solutions

In Drupal 8 this is part of core. Enable RESTful Web Services and optionally the REST UI module, and in a few minutes you can have this more or less out of the box.

In Drupal 7 there are two modules that will do 90% of the work for us. If we really just want the raw node as JSON you could use Content as JSON. But often we want more control over field selection, and the option to pull in related content (which the developer in this case did, and used as his argument for the approach taken). Views Datasource gives us the power of Views to select the data and provides us a JSON (and a few other) display option.

Views Datasource based approach:

  1. Install Views, ctools, and Views Datasource
  2. Create your view and set the display format to JSON data document.
  3. Pick your fields, set the path, and define the contextual filter.

From there save your view and you’re done. That’s all there is to it. No custom code to maintain, you get to rely on popular community tools to handle access checking and other security concerns, and you get multiple layers of caching.

So what happened?

This sin never saw the light of day.

At the time I encountered this “solution” I worked for a company that was asked to review this site while it was still under development. Our code review gave the client a chance to go back to the developer and get a fix. The developer chose a more complicated solution than the Views one presented here (they defined a custom menu router with hook_menu(), moved much of this into the callback, and added a few security checks), which was good enough for the project. But I still would have done it in Views: it is much faster to develop, Views plays nicely with Drupal caching to help improve performance, and a straightforward approach is easy for a future developer to maintain.
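For comparison, a custom router along those lines could look something like the sketch below. This is not the developer's actual code: the module name event_api, the path, and the function names are all invented for illustration, and the fallback define() calls exist only so the snippet loads outside Drupal. The %node wildcard lets Drupal load the node and return a 404 for a bad ID, and node_access() applies the normal view permissions.

```php
<?php
// These constants come from Drupal core; the fallbacks only let the sketch load outside Drupal.
if (!defined('MENU_CALLBACK')) {
  define('MENU_CALLBACK', 0x0000);
}
if (!defined('LANGUAGE_NONE')) {
  define('LANGUAGE_NONE', 'und');
}

/**
 * Implements hook_menu(). "event_api" is a hypothetical module name.
 */
function event_api_menu() {
  $items = array();
  // %node makes Drupal run node_load() on the ID and return a 404 when it fails.
  $items['api/event/%node'] = array(
    'page callback' => 'event_api_node_json',
    'page arguments' => array(2),
    // node_access('view', $node) enforces the usual node permissions.
    'access callback' => 'node_access',
    'access arguments' => array('view', 2),
    'type' => MENU_CALLBACK,
  );
  return $items;
}

/**
 * Builds the response array from a loaded node; produces no output itself.
 */
function event_api_build_response($node) {
  $body = isset($node->body[LANGUAGE_NONE][0]['value'])
    ? $node->body[LANGUAGE_NONE][0]['value']
    : '';
  return array(
    'nid' => (int) $node->nid,
    'title' => $node->title,
    'body' => $body,
  );
}

/**
 * Page callback: print the node as JSON, then return so Drupal can finish the request.
 */
function event_api_node_json($node) {
  drupal_json_output(event_api_build_response($node));
}
```

Keeping the array building in its own function means it can be checked without a running site, and the page callback never needs to call die().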

This Week’s Drupal Fire Drill

This week many in the Drupal community lost a lot of sleep Tuesday night because the security team treated us to a warning about major security updates due out on Wednesday. Fortunately for many it wasn’t a crisis in the end, but it gave us all a chance to practice for the worst. Basically, it was like a fire drill in an elementary school: we got to prepare like there was a disaster, but since there wasn’t one we don’t really know how it would have gone if there was actually a fire. We haven’t had a stop-drop-and-roll type of emergency in a while, so it was a good refresher on how to handle a crisis.

For those who don’t know what I’m talking about here’s a quick review. At Cyberwoven, like many Drupal shops, we follow the Drupal Security twitter feed on one of our Slack channels so we saw this mid-afternoon Tuesday:

Slack posting of tweet from the security team.

I read the PSA with images of Drupalgeddon dancing in my head:

There will be multiple releases of Drupal contributed modules on Wednesday July 13th 2016 16:00 UTC that will fix highly critical remote code execution vulnerabilities (risk scores up to 22/25). These contributed modules are used on between 1,000 and 10,000 sites. The Drupal Security Team urges you to reserve time for module updates at that time because exploits are expected to be developed within hours/days. Release announcements will appear at the standard announcement locations.

Drupal core is not affected. Not all sites will be affected. You should review the published advisories on July 13th 2016 to see if any modules you use are affected.

Oh, that line about the modules being used on between 1,000 and 10,000 sites (bold in the original post) wasn’t part of the original announcement. On Tuesday we didn’t have a sense of scale: were we talking about modules that everyone uses on almost every site? (ctools came up more than once.) It’s the one thing I wish the security team had done differently: given us that sense of scale.

I read all security postings, and make sure we take prompt steps to address them for clients as needed, but the potential here was that we’d have to update all 70+ sites in a few hours or less, which is very different from run-of-the-mill security updates that often aren’t related to the use cases and threat profiles of the majority of sites.

Here’s what we did next:

Tuesday

  1. Took a minute to panic, complain, and joke about pending illnesses. This is actually a useful step because it allowed me to burn off some nervous energy and then to focus on the real work.
  2. Pulled out the list of all active clients with Drupal sites, and double-checked it for accuracy.
  3. Made sure a developer had a working repo for all sites (70+). Since we had a couple people out of the office, and some projects had been reassigned recently to different developers, this was an important step to make sure no sites fell through the cracks during a rush to update them all.
  4. Made sure we knew which were Drupal 6 sites and which were Drupal 7, in case we were able to determine that 6 was also affected. Since I knew the announcement would likely skip D6, we needed to accept that we might have to take those sites offline for a time.
  5. Made sure leadership knew that all developers might be busy starting at 16:00 UTC on Wednesday. We didn’t actually cancel anything right away, but I didn’t want anyone surprised if we were all too busy posting and testing updates to worry about things like meetings.
  6. Made sure complex projects were thought about ahead of time: sites with unusual setups or ongoing dev work that make sudden updates complex. For example we have one client that has 16 sites that all have an unusual set up, so we agreed who would handle those and made sure she was prepared.

Wednesday

  1. First thing in the morning I saw the update to the notice that gave us a sense of scale and relaxed a little, but still made sure we were fully prepped.
  2. Noon: the announcement of what modules were affected was released and a couple other developers and I immediately reviewed the releases. We relaxed once we determined none of our clients were using any of the modules listed.
  3. I reviewed code from each of the three modules to see what the change was to look for ways to improve my own code to avoid similar errors.
  4. Looked for ways to improve our response for the next time it’s not a drill.

Things would have been more exciting if we’d had to update our sites. Since we were prepared it was a matter of minutes for us to check that all our sites were secure. Each developer checked all the sites in their sandbox, and since I knew all sites were in someone’s sandbox that gave us 100% coverage without having to do lots of double checks.

I think it is too easy to look past doing the code review of modules we weren’t using but I find this kind of follow up really useful. Looking back on Drupalgeddon it’s amazing how much pain was caused by such a small error (16 characters are all that were needed to fix it). And by seeking to understand what went wrong you can look for places that you make similarly invalid assumptions.

If you read my post on making new mistakes, you also know I believe that looking for improvement is the most important detail (particularly when it turned out to be a drill, not a fire). Here’s my initial list of things to improve:

  • Have a system to automatically check every site for specific modules (this was already under development but will take a little while longer to complete).
  • Make sure at least two developers have a working sandbox for all projects at all times in case something comes out during a vacation.
  • Improve internal messaging about what to expect – template message and process.  I tossed together some disjointed thoughts for account managers. But disjointed developer thinking does not make people feel like you’re on top of things.
  • Have better tracking of outliers:  I completely missed that I had a demo site on Pantheon that did have the Coder module running.  Since it was Pantheon they alerted me to this problem and had taken steps until I could do the update myself. But it would have been bad if that site had been someplace else and/or in production.
  • Make sure everyone knows when the actual release is coming out, and what the outcome was. Several developers were hoping to get lunch before the updates, but hadn’t done so when the announcement came out (which could have been a problem if we’d ended up busy). And I spent the rest of the day answering one-off questions from the account team who wanted to know if the announcement had been bad news.
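As a very rough sketch of the first item on that list, a script can walk each site's code base looking for the .info file of each module named in an advisory. This is not the system we had under development; the function name and directory layout are invented for illustration, and it assumes Drupal 6 or 7 style modules (Drupal 8 uses .info.yml files instead).

```php
<?php
/**
 * Returns the site roots that contain any of the named modules,
 * keyed by root, with the list of modules found in each.
 */
function find_sites_with_modules(array $site_roots, array $module_names) {
  // Map "coder.info" => "coder" so each file name lookup is cheap.
  $wanted = array();
  foreach ($module_names as $name) {
    $wanted[$name . '.info'] = $name;
  }

  $hits = array();
  foreach ($site_roots as $root) {
    if (!is_dir($root)) {
      continue; // skip missing checkouts rather than failing the whole scan
    }
    $files = new RecursiveIteratorIterator(
      new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
      $filename = $file->getFilename();
      if (isset($wanted[$filename])) {
        $hits[$root][$wanted[$filename]] = TRUE;
      }
    }
  }

  // Turn the per-site sets into plain lists of module names.
  return array_map('array_keys', $hits);
}
```

Calling it with a list of local checkouts and array('coder') returns only the sites that ship that module. It only proves the code is present, not that the module is enabled, so a hit still needs a human to look at it.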

Ideally we’d come up with a method to automate security updates (maybe all updates), but that’s not totally straightforward. We have to worry about required patches, non-standard setups, automated testing, and other details. There has been discussion on the Pantheon power users’ mailing list, but every shop has a slightly different workflow (like the fact that we don’t use Pantheon much at all), so we’ll need to come up with a system that accommodates our own workflow.

When Should I Update my Drupal Site to Drupal 8?

Last year Drupal 8 finally arrived, and brought the question that comes with every new release of Drupal: when should I update? New releases of Drupal mean two things: new features and cool new tools, and the retirement of an old version. We got the power and flexibility of Symfony and Drupal 6 sites are no longer getting community support. Unlike WordPress, which has well defined upgrade paths, each version of Drupal is a new adventure in upgrade pain. The more I watch people suffer with this pain, and the more I watch them try to find a way to do upgrades that preserve their site’s fundamental structure, the more I come to the conclusion that this pain is telling us something: we’re doing it wrong. Not because Drupal’s strategy is wrong, but because keeping all your content in the same structures is usually wrong. Drupal 8 should not make it easy for you to continue to use an old strategy, it should encourage us to update old assumptions.

Here is how I encourage everyone to view their choices:

If you have a Drupal 6 (or older) site you should update right now. Drupal 6 is no longer getting security updates so you are on borrowed time. But more importantly Drupal 8 is a better tool for the current state of the web than your Drupal 6 site. Most sites running on D6 reflect an online communications strategy that’s at least 4 or 5 years old. Those sites probably aren’t responsive, aren’t prepared to support apps, don’t have the right focus on social media and user engagement, and make assumptions about user behaviors that have evolved. Skip to Drupal 8: do not migrate these sites to Drupal 7. If there is a tool that is missing from Drupal 8 that your current site uses make sure you need it before complaining (or paying to have someone port it for you). Maybe that tool hasn’t been ported because it doesn’t make sense anymore. Some things are still missing, but lots of things are being rewritten differently because we have a better platform. The community is smarter than it was 5 or 10 years ago, and the platform is better, take the time to figure out why something hasn’t been ported: is it just no one has bothered, or has something better been built instead?

If you have a Drupal 7 site you should update when your web site no longer supports your work.  This is actually the same advice I just gave, but without a few assumptions like “you need a secure site.” Many Drupal 7 sites have a lot of life left in them. A site you built today will be designed to meet the needs you have now, and the ones you foresee in the near future. Three years from now (when Drupal 7 is scheduled to lose support from the community) you will be operating on assumptions that have probably been wrong for at least two years. Every six months you should ask yourself: does my site reflect my online strategy, and is my strategy still working? If the answer is yes to both of those questions you are fine, if the answer is no to either – particularly the second – you should engage someone to help you update your strategy and rebuild your site.

I’ve been part of projects that failed in part because we tried to port a stale strategy and stale content to a fresh site. We broke the new site before it even launched. Don’t try to make Drupal 8 behave like your old site: embrace the change.

Nonprofits Drive Innovation in Online Communications

I spent ten years working at a nonprofit organization wishing I had the kinds of resources that large corporations can put toward their marketing efforts. At a nonprofit the organization’s web site and related marketing are usually seen as overhead, and overhead is bad, therefore budgets are limited. Nonprofit budgets are tight in general, which doesn’t leave a lot of extra room for fancy services, tools, and consultants.

Then I started to work with large corporations. Turns out, all that money doesn’t necessarily bring you people who know how to spend it well.  Yes the margins are bigger, and there is less complaining about the basic costs of doing business, but when it comes right down to it they aren’t any more strategic than a small scrappy team of people in the communications department of any organization large enough to have a communications team.

This shouldn’t have been a surprise.  A great deal has been written about start-up culture and ways to help companies recreate the energy, passion, and creativity of their lean early days.  And there has been a great deal written about impostor syndrome which nonprofit communications staff tend to have in spades.

Of course I’m speaking here in sweeping generalities about two massive groups, but here is what I’ve seen working with both nonprofits and for-profits:

  1. As a group nonprofit staff are there because they care about the cause(s) of the organization, and they are driven to help the organization succeed despite their lack of resources.
  2. The lack of resources — both in terms of time and money — forces NPOs to find creative solutions to their problems. They moved aggressively into social media because it was a free way to spread their message: companies then used the lessons learned by nonprofits to craft their early engagements with social media.
  3. Due to corporate donations, nonprofits actually have access to the best software tools money can buy. Salesforce, NetSuite, Google, Microsoft, Adobe, and others give nonprofits amazing discounts that allow them access to tools companies twice their size can barely afford. I used to (legally) get $20,000 server packages from Microsoft for $200. Google gives $10,000/month AdWords grants. Salesforce and NetSuite provide amazing tools at amazing prices.
  4. Nonprofits are right to believe if they had access to better tools and more money they could do even better. Tools written for nonprofits tend to be second rate (look at the vast majority of fundraising toolkits), and they are held back in the places where they need specialized software. I have friends that write this stuff, they work hard, but with literally billions less in resources they have a big hill to climb.
  5. Organizations like N-TEN have been helping nonprofits learn from each other and from the best of the for-profit world for nearly 15 years. That community has benefited from thought leaders like Beth Kanter, John Kenyon, Ryan Ozimek, and others who help NPOs focus on their goals instead of their tools.
  6. For-profit marketing staff do not believe they have anything to learn from nonprofits, and are often making mistakes that were the subject of basic talks at conferences like NTC 5 years ago.

Nonprofits often struggle to figure out the right way to leverage new tools because they try to leverage them first. When traditional companies start trying to market in new spaces they sometimes make it look easy because they have a path to follow.  A path broken by nonprofits.