<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US"><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://hubofco.de/feed.xml" rel="self" type="application/atom+xml" /><link href="https://hubofco.de/" rel="alternate" type="text/html" hreflang="en-US" /><updated>2026-01-08T13:26:23+00:00</updated><id>https://hubofco.de/feed.xml</id><title type="html">Hubofcode</title><subtitle>My personal blog on making software.</subtitle><author><name>@samueljabiodun</name></author><entry><title type="html">The Hard Parts of Building Document Delta Sync on Mobile and How I Solved Them</title><link href="https://hubofco.de/mobile%20development/software%20engineering/2025/12/31/the-hard-parts-of-building-document-delta-sync-on-mobile-and-how-i-solved-them/" rel="alternate" type="text/html" title="The Hard Parts of Building Document Delta Sync on Mobile and How I Solved Them" /><published>2025-12-31T19:34:00+00:00</published><updated>2025-12-31T00:00:00+00:00</updated><id>https://hubofco.de/mobile%20development/software%20engineering/2025/12/31/the-hard-parts-of-building-document-delta-sync-on-mobile-and-how-i-solved-them</id><content type="html" xml:base="https://hubofco.de/mobile%20development/software%20engineering/2025/12/31/the-hard-parts-of-building-document-delta-sync-on-mobile-and-how-i-solved-them/"><![CDATA[<p>I recently built a document auto-sync feature with Google Drive and iCloud for my personal hobby project, <a href="https://keeplys.com">Keeplys</a>. This post explains the challenges and my approach to solving them.</p>

<p>For context, Keeplys is a hobby project: a simple document manager running on a mobile phone that I’ve been building in my free time. When you store or scan documents via Keeplys, it organises them into a folder locally on the device.</p>

<p>Keeplys is local-first by default. Your documents live on your device, but that raises an important question: what happens if the device is lost? How do you recover your documents? It wasn’t hard to see that this was a problem worth solving.</p>

<p>Before building the sync feature, the main question on my mind was: how do you sync files to the cloud? I wanted to be able to scan documents on my phone, organise them locally, and have the files backed up to the cloud.</p>

<p>The easy answer would have been to simply upload files, or download updates once a day. This approach quickly falls short when you consider:</p>

<ul>
  <li>What if a user edits a document on their phone while the same document is being edited on their laptop? How does the app resolve conflicts?</li>
  <li>What if the network drops mid-upload while syncing files?</li>
  <li>What if a user runs out of cloud storage?</li>
  <li>What if someone deletes the parent cloud folder used for backup (Keeplys) directly from Google Drive’s web interface?</li>
  <li>What about iCloud, which isn’t even a REST API?</li>
</ul>

<p>This is a story of how I built a sync engine that handles all of this.
<img src="/uploads/Screenshot%202025-12-31%20at%2021.39.37.png" alt="images" /></p>

<h3 id="implementation-guiding-principles">Implementation guiding principles</h3>

<p>I’ve come to understand the power of setting guiding principles when exploring choices and narrowing options down. I leaned on this approach and defined a few principles that I wanted the feature to adhere to.</p>

<h4 id="the-app-must-work-perfectly-offline">The app must work perfectly offline</h4>

<p>Cloud sync is a feature; it should never be a dependency. Users should never see a loading spinner while waiting for sync.</p>

<p>This means:</p>

<ul>
  <li>Documents are saved locally first, always</li>
  <li>Sync happens asynchronously in the background</li>
  <li>Pending uploads queue up and execute when connectivity returns</li>
</ul>

<h4 id="cloud-deletion-shouldnt-interfere-with-what-is-stored-locally">Cloud deletion shouldn’t interfere with what is stored locally</h4>

<p>The most dangerous operation in any sync system is delete. Google Drive and iCloud should never silently delete local data.</p>

<p>If a user deletes a file from Google Drive’s web interface, the app shouldn’t mirror that action locally. Instead, the document should be marked as <em>unsynced</em> and the user should decide what to do.</p>

<h4 id="drive-agnostic">Drive agnostic</h4>

<p>I want to leave room for extension. The first iteration supports Google Drive and iCloud, but I want to allow for other providers like Dropbox, Box, etc. To support this, the design must be drive agnostic. I designed a single sync engine that talks to a <code class="language-plaintext highlighter-rouge">DriveProvider</code> interface.</p>

<h4 id="documents-should-have-observable-state">Documents should have observable state</h4>

<p>Documents should have visible sync status: synced, pending, conflict, or unsynced. Users should always know what’s happening.</p>

<h3 id="the-domain-model">The domain model</h3>

<p>To support the above, I created the following core types:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// A local document</span>
<span class="kr">interface</span> <span class="nx">Document</span> <span class="p">{</span>
  <span class="nl">id</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span>
  <span class="nl">name</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span>
  <span class="nl">contentHash</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span> <span class="c1">// SHA-256 of the document</span>
  <span class="nl">modifiedAt</span><span class="p">:</span> <span class="kr">number</span><span class="p">;</span>
  <span class="c1">// ...</span>
<span class="p">}</span>

<span class="c1">// A reference to a document in a cloud drive or storage</span>
<span class="kr">interface</span> <span class="nx">RemoteRef</span> <span class="p">{</span>
  <span class="nl">provider</span><span class="p">:</span> <span class="dl">'</span><span class="s1">google-drive</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">apple-icloud</span><span class="dl">'</span><span class="p">;</span>
  <span class="nl">providerFileId</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span>
  <span class="nl">etag</span><span class="p">?:</span> <span class="kr">string</span><span class="p">;</span> <span class="c1">// The cloud's version identifier</span>
<span class="p">}</span>

<span class="c1">// A queued operation</span>
<span class="kr">interface</span> <span class="nx">Operation</span> <span class="p">{</span>
  <span class="nl">id</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span> <span class="c1">// UUID for idempotency</span>
  <span class="nl">type</span><span class="p">:</span> <span class="dl">'</span><span class="s1">create</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">update</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">delete</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">move</span><span class="dl">'</span><span class="p">;</span>
  <span class="nl">localId</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span>
  <span class="nl">retryCount</span><span class="p">:</span> <span class="kr">number</span><span class="p">;</span>
<span class="p">}</span>

<span class="c1">// The sync state for a provider</span>
<span class="kr">interface</span> <span class="nx">SyncState</span> <span class="p">{</span>
  <span class="nl">provider</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span>
  <span class="nl">changeCursor</span><span class="p">?:</span> <span class="kr">string</span><span class="p">;</span> <span class="c1">// Delta token for incremental sync</span>
  <span class="nl">pendingOps</span><span class="p">:</span> <span class="nx">Operation</span><span class="p">[];</span>
<span class="p">}</span>

</code></pre></div></div>

<p>One key thing to note is the <code class="language-plaintext highlighter-rouge">changeCursor</code>. Google Drive supports the concept of a delta token: a cursor that lets you ask <em>“what changed since last time?”</em> instead of listing every file on every sync. This is the difference between syncing 3 files and re-scanning 3,000. iCloud’s document storage has no direct equivalent, which I cover in the iCloud section below.</p>
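
<p>To make the cursor idea concrete, here is a minimal sketch using an in-memory change log. <code class="language-plaintext highlighter-rouge">FakeChangeFeed</code>, <code class="language-plaintext highlighter-rouge">Change</code>, and <code class="language-plaintext highlighter-rouge">syncOnce</code> are hypothetical names for illustration, not the types Keeplys uses, and the cursor here is just an index into the log:</p>

```typescript
// Hypothetical shapes for illustration; not the app's actual types.
interface Change {
  fileId: string;
  removed: boolean;
}

interface ChangeResult {
  changes: Change[];
  nextCursor: string; // cursor to persist for the next sync
}

// An in-memory stand-in for a provider's change feed.
// The cursor is simply an index into an append-only change log.
class FakeChangeFeed {
  private log: Change[] = [];

  record(change: Change): void {
    this.log.push(change);
  }

  getChanges(cursor?: string): ChangeResult {
    // No cursor means "first sync": return everything.
    const start = cursor === undefined ? 0 : Number(cursor);
    return {
      changes: this.log.slice(start),
      nextCursor: String(this.log.length),
    };
  }
}

// One sync pass: fetch only what changed since the stored cursor.
// The caller persists result.nextCursor for the next pass.
function syncOnce(feed: FakeChangeFeed, storedCursor?: string): ChangeResult {
  return feed.getChanges(storedCursor);
}
```

<p>The first call without a cursor returns the full log; later calls with the stored cursor return only what was recorded since, which may be nothing at all.</p>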

<h3 id="the-provider-interface">The provider interface</h3>

<p>I created a contract that every cloud provider must implement. As I add new providers (e.g. Dropbox), each one must conform to <code class="language-plaintext highlighter-rouge">DriveProvider</code>.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">interface</span> <span class="nx">DriveProvider</span> <span class="p">{</span>
  <span class="nl">id</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span>
  <span class="nl">platforms</span><span class="p">:</span> <span class="p">(</span><span class="dl">'</span><span class="s1">ios</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">android</span><span class="dl">'</span><span class="p">)[];</span> <span class="c1">// iCloud is iOS-only</span>

  <span class="c1">// Auth</span>
  <span class="nx">authenticate</span><span class="p">():</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="nx">AuthResult</span><span class="o">&gt;</span><span class="p">;</span>

  <span class="c1">// The delta API (critical for efficiency)</span>
  <span class="nx">getChanges</span><span class="p">(</span><span class="nx">cursor</span><span class="p">:</span> <span class="kr">string</span><span class="p">):</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="nx">ChangeResult</span><span class="o">&gt;</span><span class="p">;</span>

  <span class="c1">// CRUD operations</span>
  <span class="nx">upload</span><span class="p">(</span><span class="nx">params</span><span class="p">:</span> <span class="nx">UploadParams</span><span class="p">):</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="nx">RemoteRef</span><span class="o">&gt;</span><span class="p">;</span>
  <span class="nx">download</span><span class="p">(</span><span class="nx">ref</span><span class="p">:</span> <span class="nx">RemoteRef</span><span class="p">,</span> <span class="nx">destination</span><span class="p">:</span> <span class="kr">string</span><span class="p">):</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="k">void</span><span class="o">&gt;</span><span class="p">;</span>
  <span class="k">delete</span><span class="p">(</span><span class="nx">ref</span><span class="p">:</span> <span class="nx">RemoteRef</span><span class="p">):</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="k">void</span><span class="o">&gt;</span><span class="p">;</span>

  <span class="c1">// Storage quota</span>
  <span class="nx">getQuota</span><span class="p">():</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="p">{</span> <span class="na">used</span><span class="p">:</span> <span class="kr">number</span><span class="p">;</span> <span class="nl">limit</span><span class="p">?:</span> <span class="kr">number</span> <span class="p">}</span><span class="o">&gt;</span><span class="p">;</span>
<span class="p">}</span>

</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">platforms</code> field is important. iCloud uses a native iOS API (the <em><a href="https://www.objc.io/issues/10-syncing-data/icloud-document-store/">Ubiquity Container</a></em>), so it simply doesn’t exist on Android. <code class="language-plaintext highlighter-rouge">platforms</code> gives me a way to control platform-specific availability and features.</p>

<p>Here’s how it’s used:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">function</span> <span class="nx">getPlatformProviders</span><span class="p">():</span> <span class="nx">DriveProvider</span><span class="p">[]</span> <span class="p">{</span>
  <span class="k">return</span> <span class="nx">allProviders</span><span class="p">.</span><span class="nx">filter</span><span class="p">(</span><span class="nx">p</span> <span class="o">=&gt;</span> <span class="nx">p</span><span class="p">.</span><span class="nx">platforms</span><span class="p">.</span><span class="nx">includes</span><span class="p">(</span><span class="nx">Platform</span><span class="p">.</span><span class="nx">OS</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>

<p>On Android, users see only Google Drive in settings. On iOS, they see both Google Drive and iCloud.</p>

<h3 id="building-google-drive-sync-feature--rest-api-cloud-provider">Building Google Drive Sync: The REST API Provider</h3>

<p>Starting with the Google Drive integration, I quickly learned that its implementation is very different from iCloud’s.</p>

<p><strong>Scoped access</strong>
 To access Google Drive, the app requests the <code class="language-plaintext highlighter-rouge">drive.file</code> scope, which limits access to files the app itself created. The app can never see a user’s vacation photos or tax documents.</p>

<p><strong>The delta API</strong>
 Google’s <code class="language-plaintext highlighter-rouge">changes.list</code> endpoint accepts a <code class="language-plaintext highlighter-rouge">pageToken</code> and returns only files that have changed since that token. On the first sync, I list everything. On subsequent syncs, I might get an empty response—meaning nothing changed. This is massively more efficient than re-listing thousands of files.</p>

<p><strong>Conflict detection via ETag</strong>
 Every file has an <code class="language-plaintext highlighter-rouge">etag</code>, which acts as a version identifier. When uploading, the app sends:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PATCH /upload/drive/v3/files/<span class="o">{</span>fileId<span class="o">}</span>
If-Match: <span class="s2">"existing-etag"</span>

</code></pre></div></div>

<p>If the <code class="language-plaintext highlighter-rouge">etag</code> doesn’t match (for example, if the user edited the file elsewhere), Google returns 412 Precondition Failed. The app catches this and routes the document to a conflict resolver.</p>
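
<p>The routing step can be sketched as a small pure function. The names <code class="language-plaintext highlighter-rouge">UploadOutcome</code>, <code class="language-plaintext highlighter-rouge">classifyUploadResponse</code>, and <code class="language-plaintext highlighter-rouge">conditionalHeaders</code> are hypothetical, and the sketch assumes the HTTP client has already produced a status code and an ETag header value:</p>

```typescript
// Hypothetical result shape; a real provider would wrap an HTTP client here.
type UploadOutcome =
  | { kind: 'uploaded'; newEtag: string }
  | { kind: 'conflict' }
  | { kind: 'error'; status: number };

// Builds the headers for a conditional update.
// Without a known ETag (first upload) no precondition is sent.
function conditionalHeaders(knownEtag?: string): Record<string, string> {
  const headers: Record<string, string> = {};
  if (knownEtag !== undefined) headers['If-Match'] = knownEtag;
  return headers;
}

// Maps the status of a conditional (If-Match) upload to an outcome.
// 412 means the remote version no longer matches the ETag we sent.
function classifyUploadResponse(status: number, etagHeader: string | null): UploadOutcome {
  if (status === 412) return { kind: 'conflict' };
  if (status >= 200 && status < 300 && etagHeader !== null) {
    return { kind: 'uploaded', newEtag: etagHeader };
  }
  return { kind: 'error', status };
}
```

<p>Keeping the classification separate from the network call makes the conflict path easy to unit test without a real Drive account.</p>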

<p><strong>Resumable uploads</strong>
 For files over 5 MB, I used Google’s resumable upload protocol. This means if the connection drops at 80%, the app resumes from 80%—not from 0%.</p>
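
<p>Resuming comes down to asking the server how many bytes it already holds and sending the rest. As I understand the protocol, the status query is answered with a 308 and a <code class="language-plaintext highlighter-rouge">Range</code> header such as <code class="language-plaintext highlighter-rouge">bytes=0-42</code>. The two helpers below are a hypothetical sketch of the offset arithmetic only, not the upload code itself:</p>

```typescript
// Parses the Range header returned with a 308 status, e.g. "bytes=0-42",
// and returns the offset of the next byte to send.
// A missing header is treated as "no bytes stored yet".
function nextUploadOffset(rangeHeader: string | null): number {
  if (rangeHeader === null) return 0;
  const match = /^bytes=0-(\d+)$/.exec(rangeHeader.trim());
  if (match === null) throw new Error('Unexpected Range header: ' + rangeHeader);
  return Number(match[1]) + 1;
}

// Builds the Content-Range header value for the remaining bytes.
function contentRangeFor(offset: number, totalBytes: number): string {
  if (offset < 0 || offset >= totalBytes) throw new Error('offset out of range');
  return 'bytes ' + offset + '-' + (totalBytes - 1) + '/' + totalBytes;
}
```

<p>For a 2,000,000-byte file where the server reports <code class="language-plaintext highlighter-rouge">bytes=0-42</code>, the next request would start at byte 43.</p>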

<h3 id="building-icloud-sync-the-filesystem-provider">Building iCloud Sync: The Filesystem Provider</h3>

<p>iCloud was a completely different beast. It’s not a REST API. It’s a synchronized filesystem managed by iOS itself.</p>

<h4 id="the-ubiquity-container">The ubiquity container</h4>

<p>iCloud gives each app a special folder called the <em>Ubiquity Container</em>. Files you put there are automatically synced by iOS in the background. To “upload” a file, you simply copy it into this folder:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="nv">source</span> <span class="o">=</span> <span class="n">localDocumentURL</span>
<span class="k">let</span> <span class="nv">destination</span> <span class="o">=</span> <span class="n">ubiquityContainerURL</span><span class="o">.</span><span class="nf">appendingPathComponent</span><span class="p">(</span><span class="s">"document.pdf"</span><span class="p">)</span>

<span class="k">try</span> <span class="kt">FileManager</span><span class="o">.</span><span class="k">default</span><span class="o">.</span><span class="nf">copyItem</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">source</span><span class="p">,</span> <span class="nv">to</span><span class="p">:</span> <span class="n">destination</span><span class="p">)</span>

<span class="c1">// iOS handles the rest</span>
</code></pre></div></div>

<p>This sounds straightforward, but it’s not quite that simple. Getting entitlements set up is very different from how Google Drive works.</p>

<h4 id="the-config-plugin">The config plugin</h4>

<p>Because I’m building the app in React Native, the generated iOS project doesn’t know about iCloud entitlements. I had to build an Expo Config Plugin (a big thanks to Claude Code here) that:</p>

<ul>
  <li>Enables the iCloud capability in Xcode</li>
  <li>Adds the correct entitlements to the app</li>
  <li>Exposes the ubiquity container path to JavaScript</li>
</ul>

<h4 id="no-delta-api">No delta API</h4>

<p>Unlike Google Drive, iCloud has no cursor-based delta API. Instead, I used <code class="language-plaintext highlighter-rouge">NSMetadataQuery</code> to watch the container for changes:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="nv">query</span> <span class="o">=</span> <span class="kt">NSMetadataQuery</span><span class="p">()</span>
<span class="n">query</span><span class="o">.</span><span class="n">predicate</span> <span class="o">=</span> <span class="kt">NSPredicate</span><span class="p">(</span><span class="nv">format</span><span class="p">:</span> <span class="s">"%K LIKE '*'"</span><span class="p">,</span> <span class="kt">NSMetadataItemPathKey</span><span class="p">)</span>
<span class="n">query</span><span class="o">.</span><span class="n">searchScopes</span> <span class="o">=</span> <span class="p">[</span><span class="kt">NSMetadataQueryUbiquitousDocumentsScope</span><span class="p">]</span>
<span class="n">query</span><span class="o">.</span><span class="nf">start</span><span class="p">()</span>
</code></pre></div></div>

<p>This gives live notifications when files change — more real-time than polling, but also harder to implement.</p>

<h3 id="the-sync-engine">The sync engine</h3>

<p>With providers abstracted, I built a sync engine that:</p>

<ul>
  <li>Transforms local changes into operations
    <ul>
      <li>Document saved → create or update operation queued</li>
      <li>Document deleted → delete operation queued</li>
    </ul>
  </li>
  <li>Persists the operation queue
    <ul>
      <li>Stored in AsyncStorage as a “sync journal”</li>
      <li>Survives app crashes and restarts</li>
    </ul>
  </li>
  <li>Executes operations with retry logic
    <ul>
      <li>Transient failures: exponential backoff (1s → 2s → 4s → 8s…)</li>
      <li>Rate limits: respect <code class="language-plaintext highlighter-rouge">Retry-After</code> headers</li>
      <li>Auth failures: pause sync and prompt re-authentication</li>
    </ul>
  </li>
  <li>Detects and routes conflicts
    <ul>
      <li>ETag mismatch → conflict state</li>
      <li>User resolves: keep local, keep remote, or keep both</li>
    </ul>
  </li>
</ul>
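
<p>The retry schedule in the list above can be sketched as two small functions. The names, the one-second base, and the 60-second cap are assumptions for illustration; the post only specifies the doubling pattern and that <code class="language-plaintext highlighter-rouge">Retry-After</code> is respected:</p>

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based).
// Doubles from baseMs on each attempt and is capped at maxMs (an assumed cap).
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// A server-provided Retry-After value (in seconds) takes priority
// over the engine's own backoff schedule.
function retryDelayMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  return backoffDelayMs(attempt);
}
```

<p>With these defaults, attempts 0 through 3 wait 1s, 2s, 4s, and 8s, matching the sequence described above.</p>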

<h3 id="the-sync-journal">The sync journal</h3>

<p>Every operation gets a UUID for idempotency. If the app crashes after uploading but before updating local metadata, it can safely replay the operation. The cloud will recognize it as a duplicate.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">interface</span> <span class="nx">SyncJournal</span> <span class="p">{</span>
  <span class="nl">providers</span><span class="p">:</span> <span class="p">{</span>
    <span class="p">[</span><span class="na">providerId</span><span class="p">:</span> <span class="kr">string</span><span class="p">]:</span> <span class="p">{</span>
      <span class="nx">cursor</span><span class="p">?:</span> <span class="kr">string</span><span class="p">;</span>
      <span class="nl">pendingOps</span><span class="p">:</span> <span class="nx">Operation</span><span class="p">[];</span>
      <span class="nl">failedOps</span><span class="p">:</span> <span class="nx">Operation</span><span class="p">[];</span>
    <span class="p">};</span>
  <span class="p">};</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">failedOps</code> array is crucial. After five retries, I stop retrying, move the operation into <code class="language-plaintext highlighter-rouge">failedOps</code>, and surface the error to the user.</p>
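
<p>A minimal sketch of that bookkeeping, assuming a failed operation is re-queued at the back of <code class="language-plaintext highlighter-rouge">pendingOps</code> until it runs out of retries. <code class="language-plaintext highlighter-rouge">ProviderJournal</code>, <code class="language-plaintext highlighter-rouge">recordFailure</code>, and <code class="language-plaintext highlighter-rouge">MAX_RETRIES</code> are hypothetical names:</p>

```typescript
interface Operation {
  id: string;
  type: 'create' | 'update' | 'delete' | 'move';
  localId: string;
  retryCount: number;
}

// Hypothetical name for one provider's entry in the journal.
interface ProviderJournal {
  cursor?: string;
  pendingOps: Operation[];
  failedOps: Operation[];
}

const MAX_RETRIES = 5; // the "five retries" limit described above

// Records a failed attempt for one operation and returns a new journal.
// The operation is re-queued at the back of pendingOps until it has used
// up its retries, then it moves to failedOps so the UI can surface it.
function recordFailure(journal: ProviderJournal, opId: string): ProviderJournal {
  const op = journal.pendingOps.filter(o => o.id === opId)[0];
  if (op === undefined) return journal; // unknown id: nothing to update
  const updated: Operation = { ...op, retryCount: op.retryCount + 1 };
  const others = journal.pendingOps.filter(o => o.id !== opId);
  if (updated.retryCount >= MAX_RETRIES) {
    return { ...journal, pendingOps: others, failedOps: [...journal.failedOps, updated] };
  }
  return { ...journal, pendingOps: [...others, updated] };
}
```

<p>Returning a new journal object instead of mutating in place keeps the step easy to persist atomically: the engine can serialise the result and write it in one go.</p>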

<h3 id="handling-deletion-in-the-cloud">Handling deletion in the cloud</h3>

<p>Deletion has a dangerous edge case: a remote delete can destroy user data.</p>

<p>Imagine this flow:</p>

<ol>
  <li>User links Google Drive</li>
  <li>All documents sync correctly</li>
  <li>User opens Google Drive on the web and deletes the Keeplys folder</li>
  <li>App syncs and sees files are “deleted” remotely</li>
  <li>App mirrors the delete locally</li>
  <li>User’s documents are gone forever</li>
</ol>

<p>I handled this with a <strong>soft-delete with confirmation</strong>:</p>

<table>
  <thead>
    <tr>
      <th>Scenario</th>
      <th>Behavior</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>User deletes locally</td>
      <td>Remote file deleted immediately</td>
    </tr>
    <tr>
      <td>Remote file deleted</td>
      <td>Local file marked <em>unsynced</em>, NOT deleted</td>
    </tr>
    <tr>
      <td>User confirms</td>
      <td>Local file permanently deleted</td>
    </tr>
  </tbody>
</table>
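
<p>The second row of the table is the important one. A minimal sketch of it, using a hypothetical <code class="language-plaintext highlighter-rouge">LocalDoc</code> shape and <code class="language-plaintext highlighter-rouge">applyRemoteDeletion</code> helper, might look like this:</p>

```typescript
type SyncStatus = 'synced' | 'pending' | 'conflict' | 'unsynced';

// Hypothetical local record; the real document model has more fields.
interface LocalDoc {
  id: string;
  status: SyncStatus;
  remoteFileId?: string;
}

// Applies a remote deletion without touching the local file:
// the link to the remote copy is dropped and the document is
// flagged as unsynced so the user can decide what to do next.
function applyRemoteDeletion(docs: LocalDoc[], removedFileId: string): LocalDoc[] {
  return docs.map((doc): LocalDoc =>
    doc.remoteFileId === removedFileId
      ? { id: doc.id, status: 'unsynced' }
      : doc
  );
}
```

<p>Nothing in this path removes a document from the list; permanent local deletion only happens in a separate, user-confirmed step.</p>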

<h3 id="handling-the-merge-problem">Handling the merge problem</h3>

<p>All backups are synced into a folder named “Keeplys”. One edge case I needed to handle is a user linking a provider when there’s already a <code class="language-plaintext highlighter-rouge">/Keeplys</code> folder in their cloud drive, perhaps from another device or from a previous install.</p>

<p>I used hash-based deduplication as follows:</p>

<ul>
  <li>List all remote files</li>
  <li>For each local file, compute a SHA-256 hash</li>
  <li>Compare local and remote files by name + hash</li>
</ul>

<table>
  <thead>
    <tr>
      <th>Local</th>
      <th>Remote</th>
      <th>Action</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>File exists</td>
      <td>Same hash</td>
      <td>Link (no transfer)</td>
    </tr>
    <tr>
      <td>File exists</td>
      <td>Different hash</td>
      <td>Conflict</td>
    </tr>
    <tr>
      <td>File exists</td>
      <td>Missing</td>
      <td>Upload</td>
    </tr>
    <tr>
      <td>Missing</td>
      <td>File exists</td>
      <td>Download</td>
    </tr>
  </tbody>
</table>
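
<p>The decision table above maps directly onto a small planning function. This sketch assumes both sides have already been reduced to a name-to-hash index; <code class="language-plaintext highlighter-rouge">HashIndex</code>, <code class="language-plaintext highlighter-rouge">MergeAction</code>, and <code class="language-plaintext highlighter-rouge">planMerge</code> are hypothetical names:</p>

```typescript
type MergeAction = 'link' | 'conflict' | 'upload' | 'download';

// Maps file name -> content hash (e.g. hex-encoded SHA-256) for one side.
type HashIndex = Record<string, string>;

// Own-property check, so inherited keys like "toString" are never matched.
function has(index: HashIndex, name: string): boolean {
  return Object.prototype.hasOwnProperty.call(index, name);
}

// Decides, per file name, which action from the table applies.
function planMerge(local: HashIndex, remote: HashIndex): Record<string, MergeAction> {
  const plan: Record<string, MergeAction> = {};
  for (const name of Object.keys(local)) {
    if (!has(remote, name)) plan[name] = 'upload';
    else if (remote[name] === local[name]) plan[name] = 'link';
    else plan[name] = 'conflict';
  }
  for (const name of Object.keys(remote)) {
    if (!has(local, name)) plan[name] = 'download';
  }
  return plan;
}
```

<p>Only the <code class="language-plaintext highlighter-rouge">upload</code> and <code class="language-plaintext highlighter-rouge">download</code> entries move bytes; <code class="language-plaintext highlighter-rouge">link</code> just records the remote reference against the existing local file.</p>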

<p>This handles the common case of reinstalling the app on a new phone without re-uploading gigabytes of documents.</p>

<h3 id="handling-sync-in-the-background">Handling sync in the background</h3>

<p>I didn’t want users to have to open the app just to sync. I implemented background sync using <code class="language-plaintext highlighter-rouge">expo-background-fetch</code>:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">BackgroundFetch</span><span class="p">.</span><span class="nx">registerTaskAsync</span><span class="p">(</span><span class="dl">'</span><span class="s1">KEEPLYS_BACKGROUND_SYNC</span><span class="dl">'</span><span class="p">,</span> <span class="p">{</span>
  <span class="na">minimumInterval</span><span class="p">:</span> <span class="mi">15</span> <span class="o">*</span> <span class="mi">60</span><span class="p">,</span> <span class="c1">// iOS minimum: 15 minutes</span>
  <span class="na">stopOnTerminate</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
  <span class="na">startOnBoot</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
<span class="p">});</span>
</code></pre></div></div>

<p>On iOS, this wraps <code class="language-plaintext highlighter-rouge">BGAppRefreshTask</code>. On Android, it wraps <code class="language-plaintext highlighter-rouge">WorkManager</code>. Both respect battery optimization and network constraints.</p>

<p>I also added a <strong>“Sync on Wi-Fi only”</strong> toggle. When enabled, background sync checks <code class="language-plaintext highlighter-rouge">NetInfo.type === 'wifi'</code> before proceeding.</p>]]></content><author><name>Samuel James</name></author><category term="Mobile Development" /><category term="Software Engineering" /><category term="react-native" /><category term="sync-engine" /><category term="google-drive" /><category term="icloud" /><summary type="html"><![CDATA[Learn how to build a robust document sync engine for mobile apps with Google Drive and iCloud integration. Covers delta sync, conflict resolution, offline-first architecture, and background sync in React Native.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/uploads/Screenshot%202025-12-31%20at%2021.39.37.png" /><media:content medium="image" url="https://hubofco.de/uploads/Screenshot%202025-12-31%20at%2021.39.37.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The core user trap and how optimising for them kills Inclusivity</title><link href="https://hubofco.de/2025/11/14/the-core-user-trap-and-how-optimization-can-kill-inclusivity/" rel="alternate" type="text/html" title="The core user trap and how optimising for them kills Inclusivity" /><published>2025-11-14T07:22:00+00:00</published><updated>2025-11-14T07:22:00+00:00</updated><id>https://hubofco.de/2025/11/14/the-core-user-trap-and-how-optimization-can-kill-inclusivity</id><content type="html" xml:base="https://hubofco.de/2025/11/14/the-core-user-trap-and-how-optimization-can-kill-inclusivity/"><![CDATA[<p><img src="https://substackcdn.com/image/fetch/$s_!LTjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249fae5a-1c29-4f6a-885c-1f281e99fe0c_4135x1575.png" alt="" /></p>

<p>Over the past few years, I’ve led growth product teams focusing on retention and growing daily active users (DAU). Working on a myriad of challenges aimed at widening access to a global audience has deeply reshaped my understanding of inclusivity and what it truly means to build for a global audience and leave no one behind.</p>

<p>For a long time, I had a narrow view of inclusivity. I saw it as a nice-to-have, rather than a holistic principle that should influence many engineering decisions. Through experience, I’ve learned that even well-intentioned technical trade-offs can unintentionally exclude certain potential users.</p>

<p>Think of it. Most products start small. They begin as simple ideas, gradually evolve, and eventually find market fit. At that point, the business identifies its core users—the cohort that drives the most value and gives the most feedback. Naturally, the focus shifts toward serving them better.</p>

<p>Teams start listening closely to these users’ pain points, prioritizing them, building new features to address their needs, and tailoring the product experience to them. This approach often feels smart and efficient. After all, it follows the familiar <strong>“80/20 rule”</strong>—what 20% of effort can deliver 80% of the impact?</p>

<p>It seems like a no-brainer to prioritize the 20% of users who drive most (or should I say 80%) of the revenue. But over time, this focus creates an unintended consequence: the remaining 80% of users are quietly left behind. They’re excluded.</p>

<p>They might still show up occasionally—out of curiosity or necessity—but each time they do, the product feels a little more distant, a little less for them. The product speaks a quiet truth: <em>“I know you need me, but I wasn’t really built for you.”</em></p>

<p>Too often, we default to <em>“design and build for core users”</em> or for users with fewer constraints because it promises better ROI. But the deeper, more important questions are:</p>

<ul>
  <li>
    <p>What’s standing in the way of users who should be using your product but aren’t?</p>
  </li>
  <li>
    <p>Are your design choices excluding people before they even have a chance to become “core users”?</p>
  </li>
</ul>

<p>Throughout my career, I’ve often heard (and said) statements like:</p>

<p><em>“Let’s prioritize the core users who matter most.”</em></p>

<p>At face value, these statements seem reasonable—and in many cases, they are. Prioritizing core users often feels like a sound, data-driven decision. But it becomes problematic when the entire development process is optimized solely around these users. Over time, the product begins to diverge, unintentionally excluding minority or edge users. The deeper we go down this path, the more self-reinforcing the cycle becomes.</p>

<h3 id="the-engineers-bubble"><strong>The Engineer’s Bubble</strong></h3>

<p>It’s not just in product decisions; it applies equally to engineering. The context we live in profoundly shapes how we build, how we think, and how we perceive reality.</p>

<p>I’m reminded of a story from one of my German classes. It described how everyone is born wearing an invisible pair of glasses through which they view the world. Some have red lenses, others blue, and so on. Each pair of glasses colors our perception—what we see feels real and objective, but it’s deeply shaped by our lens.</p>

<p>The same is true for us as engineers. We build from within our own cultural and technological bubbles, assuming our experiences and environments are universal—but they’re not. Our daily context is one of privilege:</p>

<ul>
  <li>
    <p>We use high-end laptops and modern devices.</p>
  </li>
  <li>
    <p>We work with fast, reliable internet and constant power.</p>
  </li>
  <li>
    <p>We test our applications on the latest browsers and operating systems.</p>
  </li>
  <li>
    <p>We are early adopters of new technologies and easily embrace change.</p>
  </li>
</ul>

<p>A few years ago, when React was becoming mainstream, my team was thrilled to migrate our legacy app to this shiny new framework. We couldn’t stop talking about how much better the developer experience would be—faster builds, cleaner code, modern patterns. We imagined the productivity gains and the delight of finally using something elegant and efficient.</p>

<p>We rebuilt core parts of the product, added sleek new features, and gave the UI a fresh, modern look. Everything ran beautifully in Chrome, Firefox, and every modern browser. I remember thinking to myself, “How could anyone not love this?” I couldn’t wait for customers to experience it.</p>

<p>Then we launched.</p>

<p>We checked the logs and usage metrics; excitement quickly turned to confusion. Adoption was down, and a sizable portion of our users weren’t making it past the first few screens. What went wrong?</p>

<p>It didn’t take long to find the answer: Internet Explorer. We had tested on every major browser except that one, and some of our users were still on IE 8 and 9. The app broke completely for them.</p>

<p>We had assumed, without question, that “everyone” used modern browsers by now. But we were wrong.</p>

<p>That experience was a wake-up call. It reminded me that our reality as engineers is not the same as our users’ reality. The tools, devices, and networks we take for granted are luxuries many don’t know or have access to.</p>

<p>A large portion of the world still uses entry-level devices with limited processing power. Many experience intermittent connectivity or rely on expensive, slow internet in their day-to-day lives. When we’re not aware of this reality, our products become less and less usable to certain users who should be using them.</p>

<p>Inclusive product engineering is about widening access. It’s about being intentional in how we design and build software. It’s about writing code that runs efficiently and work for all users. It’s about optimizing yours so it has low battery, data, and memory usage. It’s about making sure no one is left behind.</p>

<p>With constant pressure to ship features quickly, inclusive software design can sometimes feel expensive or time-consuming. But the truth is, inclusion doesn’t always require much up front. What it requires is more awareness, discipline, and intentionality from us.</p>

<p>How can you start making your product more accessible to the world?</p>

<h3 id="start-with-awareness"><strong>Start with awareness</strong></h3>

<ul>
  <li>
    <p>Begin with questions, not code. Challenge assumptions early and often.</p>
  </li>
  <li>
    <p>Support accessibility from the start.</p>
  </li>
  <li>
    <p>Who are you building for, how diverse are they, and how do they engage in the digital world?</p>
  </li>
  <li>
    <p>Who might this exclude?</p>
  </li>
  <li>
    <p>If you’re improving an existing feature, where are users dropping off—and why?</p>
  </li>
  <li>
    <p>What happens if the user has a slower device or poor connectivity?</p>
  </li>
</ul>

<h3 id="test-your-product-under-different-constraints"><strong>Test your product under different constraints</strong></h3>

<p>Just as automated tests shouldn’t stop at the “happy path,” feature testing shouldn’t either.</p>

<ul>
  <li>
    <p>Throttle network speeds to simulate unreliable internet. Does your app handle it gracefully?</p>
  </li>
  <li>
    <p>Test on older, entry-level devices or budget phones.</p>
  </li>
  <li>
    <p>Use accessibility tools like color-blindness simulators or screen readers (such as VoiceOver) to check usability.</p>
  </li>
  <li>
    <p>Check how the app behaves with both left-to-right (LTR) and right-to-left (RTL) languages.</p>
  </li>
</ul>

<h3 id="design-and-code-with-constraints-in-mind"><strong>Design and code with constraints in mind</strong></h3>

<ul>
  <li>
    <p>Use lightweight assets and reduce dependency on heavy libraries.</p>
  </li>
  <li>
    <p>Prioritize clarity and the primary jobs users need to get done.</p>
  </li>
  <li>
    <p>Respect data-saving mode in your code.</p>
  </li>
</ul>

<h3 id="involve-diverse-voices-early"><strong>Involve diverse voices early</strong></h3>

<p>Encourage your research and design teams to broaden their participant groups and ensure diverse representation.</p>

<p>If that’s not possible, ask for feedback from teammates in different regions, on different devices, or with different cultural perspectives. Even small variations in context can surface assumptions you didn’t realize you were making.</p>

<hr />

<p>In conclusion, focusing on the “core 20%” might feel efficient in the short term, but it risks alienating the very users who could expand your product’s reach and impact. As engineers, our code shapes the world. Making the world more equitable begins by stepping outside our bubble, questioning our assumptions, and designing for the full spectrum of users—not just the ones who look, think, or live like us.</p>]]></content><author><name>Samuel James</name></author><category term="inclusive engineering" /><category term="building for the billions" /><category term="building for a global audience" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://substackcdn.com/image/fetch/$s_!LTjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249fae5a-1c29-4f6a-885c-1f281e99fe0c_4135x1575.png" /><media:content medium="image" url="https://substackcdn.com/image/fetch/$s_!LTjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F249fae5a-1c29-4f6a-885c-1f281e99fe0c_4135x1575.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to Get Good at Giving Feedback: A Guide for Engineering Leaders</title><link href="https://hubofco.de/engineering%20leadership/software%20engineering/2024/03/02/how-to-get-good-at-giving-feedback/" rel="alternate" type="text/html" title="How to Get Good at Giving Feedback: A Guide for Engineering Leaders" /><published>2024-03-02T17:35:00+00:00</published><updated>2024-03-02T00:00:00+00:00</updated><id>https://hubofco.de/engineering%20leadership/software%20engineering/2024/03/02/how-to-get-good-at-giving-feedback</id><content type="html" xml:base="https://hubofco.de/engineering%20leadership/software%20engineering/2024/03/02/how-to-get-good-at-giving-feedback/"><![CDATA[<p><img 
src="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78e9058f-62bb-4180-bdd4-15ed3345bbf1_2090x443.png" alt="" />
When we touch a hot stove, we quickly remove our hands and learn never to touch it again when it’s hot.</p>

<p>We all learn through feedback, from the searing pain of a hot stove to the warm praise of a job well done. It’s the cornerstone of growth, both in life and especially in the world of work.<br />
As a senior engineer or tech lead, feedback is one of the most effective tools in your toolbox to help level up people around you.</p>

<p>But how do you give feedback effectively? For many, it’s shrouded in misconceptions and uncomfortable emotions.</p>

<p>It took me some time to recognize some misconceptions I held about giving feedback. As an example, I once asked a report in a 1:1, “What change can I make now to better support you?” He responded, “I would love to receive more feedback from you.” I thought to myself, “I’m always on the lookout for places where I can give you feedback. You’re doing excellently, and I can hardly find areas where I could suggest improvement.”</p>

<p>One of the common misconceptions about giving feedback is that it should be given only when an opportunity for improvement is spotted. And I held on to that misconception for a while.</p>

<h2 id="ditch-the-myths">Ditch the myths</h2>

<p>Let’s debunk some myths about giving feedback and explore the four simple steps to getting good at giving feedback.</p>

<p><strong>Myth 1: Feedback is only for negative corrections.</strong> Not true! Feedback celebrates strengths, reinforces positive behaviors, and guides future actions; it is not always about fixing mistakes.</p>

<p><strong>Myth 2: Instant feedback is always best.</strong> Emotions run high in the moment, leading to potentially hurtful and unproductive exchanges. Wait until you’re calm and collected.</p>

<p><strong>Myth 3: Sandwiching is the way to go.</strong> Sugarcoating feedback dilutes its impact. Be direct, yet respectful.</p>

<p><strong>Myth 4: You need a laundry list of examples.</strong> You don’t. Focus on one or two key observations for clarity and impact.</p>

<p><strong>Myth 5: Feedback is a one-way street.</strong> It’s a conversation! Encourage questions, active listening, and shared understanding.</p>

<p><strong>Myth 6: Feedback flows top-down only.</strong> <a href="https://softwareleads.substack.com/p/how-to-build-a-high-performing-software">High-performing teams</a> have feedback flowing everywhere: leader-to-team, team-to-leader, and peer-to-peer.</p>

<p>Believing that you have to give feedback instantly can make you give feedback when you don’t have your emotions in check. You want to avoid giving feedback when you’re angry because it’s far more likely the recipient will feel hurt or judged—and thus defensive.</p>

<p>Feedback shouldn’t be given only when you want a change from the receiver. Feedback is not only about correcting mistakes and driving change but also about recognizing positive contributions and strengths and reinforcing positive behaviors.</p>

<p>Feedback is a two-way dialogue where both the giver and the receiver share their perspectives, ask questions, listen actively, and seek to understand each other.</p>

<p>Anyone can be both a giver and a receiver of feedback. It should not flow only from the top down. In a high-performing organization, feedback flows top-down (from managers to reports), bottom-up (from reports to managers), and laterally (from peers to peers).</p>

<h2 id="four-simple-steps-to-get-better-at-giving-feedback">Four simple steps to get better at giving feedback</h2>

<p>Follow these four steps to offer better feedback.</p>

<h3 id="1-ask">1. Ask</h3>

<p>Before diving in, ensure the recipient is open to hearing your thoughts. When you ask, it gives the receiver control, ensures that you have the attention of the person, and sets expectations for what the conversation will be about.</p>

<p>A simple question like “Can I share some feedback with you?” can do wonders.</p>

<h3 id="2-describe-the-behavior-or-action">2. Describe the behavior or action</h3>

<p>Avoid judging intentions. As a feedback giver, your job is not to judge the intentions of others. You’re not in a position to put a label on the action. Your role is to express the impact their actions had on you, the team, or the project.</p>

<p>When feedback is focused on the past, the other party can’t do anything about it; defensiveness on the side of the receiver is sometimes inevitable. Good feedback should focus less on the past and more on the future, making the receiver feel that the purpose of the conversation is for the future.</p>

<p>The book <a href="https://www.manager-tools.com/products/effective-manager-book-second-edition">The Effective Manager</a> offers a simple way of describing actions that I found useful. It starts with the word “when.” For example:</p>

<ul>
  <li>
    <p>“When you took the initiative to…”</p>
  </li>
  <li>
    <p>“When you presented today…”</p>
  </li>
  <li>
    <p>“When you don’t communicate progress like you did yesterday…”</p>
  </li>
  <li>
    <p>“When you spoke about…”</p>
  </li>
  <li>
    <p>“When you stay an extra hour to find the root cause…”</p>
  </li>
  <li>
    <p>“When you respond politely after the customer…”</p>
  </li>
  <li>
    <p>“When you were presenting your ideas…”</p>
  </li>
</ul>

<h3 id="3-highlight-the-impact">3. Highlight the impact</h3>

<p>Once you have described the action, proceed to explain how the behavior affected you, the team, or the project. The pattern is: “When you did XYZ, this is the impact it had on the team or the project.”</p>

<p>Example:</p>

<p><strong>Behavior</strong>: When you took the initiative to…</p>

<p><strong>Impact</strong>: This helped the team move forward efficiently and deliver on time.</p>

<h3 id="4-encourage-future-action">4. Encourage future action</h3>

<p>If the feedback is negative, suggest what could be done differently in the future. If it’s positive, express appreciation and encourage continued behavior.</p>

<h2 id="building-trust-is-key">Building trust is key</h2>

<p><em>Trust is like the air we breathe; when it’s present, nobody really notices; when it’s absent, everybody notices. – Warren Edward Buffett</em></p>

<p>When people trust the giver, they’re more receptive and open to learning. Invest time in building genuine connections with your team. This fosters an environment where feedback is seen as a tool for growth, not criticism.</p>

<p>In conclusion, giving feedback is a skill that can be learned, not a magic trick. By practicing these steps and fostering a culture of trust, you can unlock the true power of feedback.</p>]]></content><author><name>Samuel James</name></author><category term="Engineering Leadership" /><category term="Software Engineering" /><category term="feedback" /><category term="mentoring" /><category term="leadership" /><category term="team-management" /><category term="engineering-management" /><summary type="html"><![CDATA[Master the art of giving effective feedback as a tech lead or engineering manager. Learn the four-step feedback framework, debunk common myths, and build a culture of trust for high-performing teams.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78e9058f-62bb-4180-bdd4-15ed3345bbf1_2090x443.png" /><media:content medium="image" url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78e9058f-62bb-4180-bdd4-15ed3345bbf1_2090x443.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">3 areas to drive clarity for sustained high-performance in teams</title><link href="https://hubofco.de/engineeringmanagement/2024/02/06/3-areas-to-drive-clarity-for-sustained-high-performance-in-teams/" rel="alternate" type="text/html" title="3 areas to drive clarity for sustained high-performance in teams" /><published>2024-02-06T07:40:00+00:00</published><updated>2024-02-06T07:40:00+00:00</updated><id>https://hubofco.de/engineeringmanagement/2024/02/06/3-areas-to-drive-clarity-for-sustained-high-performance-in-teams</id><content type="html" 
xml:base="https://hubofco.de/engineeringmanagement/2024/02/06/3-areas-to-drive-clarity-for-sustained-high-performance-in-teams/"><![CDATA[<p><em>Great leaders lead by distilling the why and the what for their teams, peers, and organization, and by overcommunicating to help drive alignment, not through control – Satya Nadella</em></p>

<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fb42724-86eb-4d94-b030-4a59956b0339_962x674.png" alt="" /></p>

<p>Have you ever been on a team that’s stuck? A team where everyone is moving in different directions? You speak to three people, and there is no consensus on what the team is trying to achieve. There are different perceptions of the team’s priorities and direction. Priorities shift from time to time, and you can feel a sense of instability or unpredictability in the team’s work. If you stick around a little longer, you start to witness signs of disengagement, low morale, or a lack of enthusiasm.</p>

<p>If you notice several of the above, it’s an indication that the team lacks clarity of direction. Where there is no clarity, a different narrative ensues. If you’re a manager, one of the most important things you can do is drive clarity of purpose, direction, plan, and responsibilities.</p>

<p>I have come to believe that clarity in the context of software development is about giving people a practical North Star to guide their thinking and actions in the right direction, even when the conditions get cloudy. I have to admit, it took me a while to learn how important this is when you lead engineering teams.</p>

<p>A team is a function of its environment. If the environment a team is operating in lacks clarity on its purpose, structure, expectations, and boundaries, it becomes really difficult to achieve sustained high performance.</p>

<p>Clarity provides the vision, the direction, the rallying cry, the strategy, and the goals of your business and team. Clarity grounds organisations and teams into something bigger.</p>

<h2 id="why-does-driving-clarity-matter">Why does driving clarity matter?</h2>

<ul>
  <li>
    <p>A collective sense of clarity of purpose within an organization has a direct impact on its performance.</p>
  </li>
  <li>
    <p>Clarity drives focus. There’s a reason why startups can move swiftly and outpace larger organizations with significantly more resources. Besides the absence of bureaucracy, successful startups embody a strong sense of purpose and instill a radical focus in every team member.</p>
  </li>
  <li>
    <p>A lack of clarity impacts commitment. If it’s not clear, it’s hard to commit to.</p>
  </li>
  <li>
    <p>Clarity drives behaviors and actions that propel organizations forward, all aligned toward common goals.</p>
  </li>
</ul>

<p>There are three major areas a leader should consistently drive clarity on to get a team to their peak performance.</p>

<h2 id="clarity-of-direction-and-purpose">Clarity of direction and purpose</h2>

<p>Charles Garfield, author of several books on peak performance, spent decades studying peak performers and found that intense commitment to what they do is one of the single most dramatic differences between peak performers and their less productive colleagues.</p>

<p>Having interviewed dozens of software engineers and hired tens of them, I have found that engineers who are passionate about what they do and the problems they solve perform better and stay longer than those who are not.</p>

<p>What really makes people develop intense commitment and passion for what they do? It is rooted in a deep understanding of “purpose.” An intense attachment to why they’re doing that thing.</p>

<p>What differentiates performance in engineering teams is how the members feel about what they’re doing. That feeling stems from purpose or, to put it better, from the “why” behind the work.</p>

<p>Effective leaders master how to communicate purpose and direction that inspire others and provoke actions.</p>

<p>McKinsey <a href="https://www.mckinsey.com/capabilities/people-and-organizational-performance/our-insights/high-performing-teams-a-timeless-leadership-topic">asked</a> over 5,000 executives to write down their peak experience as team members and describe the environment they were in. The first of the three things that stood out was alignment on direction: a shared belief about what the organization is striving towards.</p>

<p>When there is clarity of purpose, a team has a navigation aid that helps set the direction of collective actions.</p>

<p>Clarity of purpose stems from everyone on the team understanding why they’re doing whatever they’re doing in the first place.</p>

<p>A North Star to aim for is a crucial part of motivating your team. When a team can’t understand why they’re doing what they’re doing, it becomes a losing battle. A small litmus test to understand if a team has clarity of purpose is:</p>

<ul>
  <li>
    <p>Does the team understand where the company is headed?</p>
  </li>
  <li>
    <p>Does the team understand why the company is headed there?</p>
  </li>
  <li>
    <p>Does the team understand how they’re contributing to that?</p>
  </li>
  <li>
    <p>Does the team understand how success is measured?</p>
  </li>
</ul>

<p>Clarity of purpose drives strategy and resources. To formulate a solid plan, you first need to be clear about what you want to achieve.</p>

<h2 id="clarity-of-plan"><strong>Clarity of plan</strong></h2>

<p>When you don’t have a plan, randomness ensues. A team member pushes a new activity, and another team member pushes an unrelated task on the side.</p>

<p>There are activities and tasks being done. But the effectiveness of the collective actions falls short. You wonder why the impact was so low. You don’t need to look far to notice an absence of clarity in the destination and the plan to get there.</p>

<p>When you have a plan, it limits your choices and helps people focus on the specific path to get to the intended destination.</p>

<p>Having a plan is the beginning. The end is to get to a place where everyone knows the plan, what their part in it is, and how that part contributes toward reaching the destination.</p>

<p>A team has clarity of plan when they’re able to understand the pieces and how those pieces will help them get to the destination.</p>

<p>To create plan clarity, start by connecting your overall mission (North Star) to actionable steps that get you to your destination, and distill that for your team.</p>

<p>A few steps can help you build clarity about your plan.</p>

<ol>
  <li>
    <p>Together with your team, do focused planning by collectively mapping out the important pieces required to get to where you want to go. People remember what they co-create better.</p>
  </li>
  <li>
    <p>Establish a set of measurable key results that you aim to achieve by specific dates.</p>
  </li>
  <li>
    <p>Map out the big projects that your team will take on to achieve those key results, and then the specific tasks to achieve them.</p>
  </li>
  <li>
    <p>Actively and consistently overcommunicate the plan.</p>
  </li>
</ol>

<h2 id="clarity-of-roles-and-responsibilities"><strong>Clarity of roles and responsibilities</strong></h2>

<p>In a soccer game, players play with a clear understanding of their roles. The goalkeeper knows his or her role. The strikers and defenders all play with a clear understanding of what role they have to play in the game. A lack of role clarity impedes high performance.</p>

<p>When I picked up my first job as a software engineer, I was handed a contract of engagement with clauses on my roles and a description of what I would be doing.</p>

<p>Months into the job, I saw how different my day-to-day work was from the eight bullet points listed on paper as my job description. There were implicit role expectations and responsibilities that were not in the job description, as I learned the hard way.</p>

<p>When individuals in a team have role clarity, everyone knows what they’re responsible and accountable for. They know the role that they have to play in the plan. They know what is expected, which tasks in the plan they have to accomplish, and how their performance on the tasks will be evaluated.</p>

<p>If you’re leading engineering teams, pay attention to whether individuals on the team have a full understanding of their roles and responsibilities in your overall plan.</p>

<p>You should consistently ask yourself:</p>

<ul>
  <li>
    <p>Does everyone understand their roles and what they have to contribute to the team’s plan?</p>
  </li>
  <li>
    <p>Do they understand the details of the results to be achieved?</p>
  </li>
</ul>

<p>If you’re an individual contributor (IC) in a team and you lack role clarity, be proactive by asking your manager detailed questions about your role expectations and how your performance will be measured. Put the expectations in writing and send them to your manager for review.</p>]]></content><author><name>Samuel James</name></author><category term="EngineeringManagement" /><category term="high-performing" /><category term="engineeringmanager" /><category term="team-building" /><summary type="html"><![CDATA[Have you ever been on a team that's stuck? A team where everyone is moving in different directions? You spoke to three people, and there is a lack of consensus on what the team is trying to achieve. There are different perceptions of the team's priorities and direction. Priorities shift from time to time, and you could feel a sense of instability or unpredictability in the team's work. If you stick a little longer, you start to witness signs of disengagement, low morale, or a lack of enthusiasm. If you notice several of the above, it's an indication that the team lacks clarity of direction. Where there is no clarity, a different narrative ensues. 
If you're a manager, one of the most important things you can do is drive clarity of purpose, direction, plan, and responsibilities.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fb42724-86eb-4d94-b030-4a59956b0339_962x674.png" /><media:content medium="image" url="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fb42724-86eb-4d94-b030-4a59956b0339_962x674.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to empower your team through autonomy, mastery and purpose</title><link href="https://hubofco.de/engineeringmanagement/2023/11/22/how-to-empower-your-team-through-autonomy-mastery-and-purpose/" rel="alternate" type="text/html" title="How to empower your team through autonomy, mastery and purpose" /><published>2023-11-22T21:52:00+00:00</published><updated>2023-11-22T21:52:00+00:00</updated><id>https://hubofco.de/engineeringmanagement/2023/11/22/how-to-empower-your-team-through-autonomy-mastery-and-purpose</id><content type="html" xml:base="https://hubofco.de/engineeringmanagement/2023/11/22/how-to-empower-your-team-through-autonomy-mastery-and-purpose/"><![CDATA[<p><img src="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37afe623-fe1c-4a38-97a1-2fb88f9b5566_1644x644.png" alt="" /></p>

<p>A few months ago, I spoke at the WeAreDevelopers World Congress on <a href="https://speakerdeck.com/abiodunjames/building-high-performing-software-teams">building high-performing software teams</a>. The feedback from that talk has been overwhelming, and I’m thrilled it resonated with many people. In this post, I will unpack one of the key points I discussed: empowering your team and getting out of the way.</p>

<p>I like to think of empowerment as involving a team in decision-making, giving them a participatory role that leverages their skills and judgment and enhances their sense of individual worth and commitment to the team’s goals.</p>

<p>When you begin leading a team, you might have the illusion that your role in the team is what keeps the lights on. You may even think that your decisions and skills alone are the primary factors responsible for getting the team to where it is. Nothing could be further from the truth. No one can lead a team or an organisation to success without the collective excellence of many.</p>

<p>The most effective teams I have seen are autonomous. Every team member works, contributes, and is equally committed to the team’s goals. The higher you advance in an organisation, the more you have to rely on diverse skills and talents in the people you’re leading to make your team thrive.</p>

<p>It’s no coincidence that the primary responsibilities of leaders include coaching, mentoring, and empowering those they lead, giving them the opportunity to take on significant initiatives, make mistakes, and continue to develop and grow along the way.</p>

<p>One of the fundamental changes you can make as a manager is reinforcing the belief that leadership is not just a function of the title someone holds on the team but that everyone can be a leader.</p>

<p>Empowering individuals on the team to lead and influence is crucial. If individuals on your team don’t believe they have the power to lead as you do, hierarchy sets in, as stated by Michael Lopp in his book, “<a href="https://www.amazon.de/Art-Leadership-Small-Things-Done/dp/1492045691/">The Art of Leadership</a>.” This situation can lead people to feel they need to seek permission to drive change or wait for someone to dictate what to do.</p>

<p>For new managers, there are subtle ways you can disempower your team. I fell into one a few years ago when I began leading a team of engineers. I would jump into facilitating inter-team communications, chase down team dependencies, and shield my team from non-technical work.</p>

<p>As my team’s scope increased, it became increasingly challenging to handle all of these tasks alone.</p>

<p>I decided it was time for me to start involving my team by showing them how I handle these tasks. I thought I would demonstrate how it’s done so they could learn from it. Implicitly, I also expected them to contribute in the areas where I involved them.</p>

<p>I tried hard to lead by setting good examples. However, the more I tried to show them, the less I saw the team replicating what I was doing. In the meetings I involved them in, they acted as observers rather than participants. I wanted them to be active contributors, but instead, they took a back seat and simply tagged along.</p>

<p>I realised that I had not only failed to communicate my intent and expectations but had also not provided enough space for them to take on responsibilities and lead. Consequently, the more I took the lead, the more the team felt crowded out. They believed it was easier and faster for me to handle those tasks than for them. They felt like they had nothing to add, which increased their disempowerment.</p>

<p>The example from my experience illustrates subtle ways in which a team or reports can become disempowered. Instead, a leader’s goal should be to work themselves out of their job.</p>

<p><em>The goal of all leaders should be to work themselves out of a job – Jocko Willink.</em></p>

<p>Empowering engineers takes different forms in engineering organisations. Sometimes, it could mean providing autonomy, sharing the big picture, and giving your team all the details they need to make the right decisions.</p>

<p>At other times, it could mean refraining from providing your team with solutions and dictating what to build. Instead, give them the problems you want to solve and provide the space for them to find solutions.</p>

<p>In other instances, it could mean empowering engineers to solve problems rather than just complete tasks, and giving them end-to-end ownership of those problems.</p>

<h2 id="empower-your-team-through-autonomy-purpose-and-mastery">Empower your team through autonomy, purpose and mastery</h2>

<p>I often observe striking similarities between motivation and empowerment. When a team lacks the drive or motivation to take ownership of a piece of work for which they are responsible from start to finish, most of the time, it is a sign that the team has not been empowered.</p>

<p>Show me an intrinsically motivated team brimming with passion to tackle problems, and I will show you an empowered team.</p>

<p>Intrinsic motivation is behaviour driven by an internal desire to do something. It’s the motivation to engage that arises from within the individual rather than from external factors.</p>

<p>One of the ways to evoke intrinsic motivation is to provide autonomy. In Daniel Pink’s book “Drive,” intrinsic motivation is based on three factors: autonomy, mastery, and purpose. Tapping into the intrinsic motivation of individuals on your team leads to more sustainable performance and teamwork, and consequently to empowerment.</p>

<h3 id="give-your-team-the-right-amount-of-autonomy">Give your team the right amount of autonomy</h3>

<p>Autonomy is one’s ability to control one’s work or make decisions regarding one’s work. Empower your team by providing them with the freedom to choose how they approach their tasks, granting them a sense of ownership and accountability for the outcomes you want them to achieve.</p>

<p>You impede autonomy when you hand down feature requirements to your team without providing the context they need to comprehend the business problem and why they are doing what they’re doing.</p>

<p>A disempowered engineering team lacks the decision-making power regarding “what” they are building and “how” they are building it. The path to building an empowered team is to ensure that everyone has a say in what they are constructing and that they have the appropriate context that enables them to make effective micro-decisions.</p>

<h3 id="give-them-space-to-struggle-through-stretch-projects">Give them space to struggle through stretch projects</h3>

<p>We’ve all experienced the joy of learning something new: the excitement that follows learning a new language or the satisfaction of finally mastering how to solve a persistent problem. That feeling of continuous improvement on something that felt nearly impossible a month ago is a crucial source of motivation that can keep you going.</p>

<p>Empower individuals on your team by providing them with opportunities to work on tasks that stretch their abilities—tasks that take them slightly out of their comfort zone but offer growth prospects.</p>

<p>Is there a team member who hasn’t yet had the chance to develop their skills in handling cross-functional projects? Create an opportunity that exposes that engineer to cross-functional projects where they can acquire new skills and collaborate with a diverse group of colleagues.</p>

<p>When you don’t allow your team to take on challenging tasks or attempt to shield them from challenges, you deny them the chance to master those challenges. By doing so, you’re not only disempowering those individuals, but you might also be depriving them of coaching opportunities.</p>

<h3 id="give-them-a-sense-of-purpose">Give them a sense of purpose</h3>

<p>Purpose signifies understanding how one’s work contributes to something meaningful and impactful. When a team can connect their efforts to a larger purpose or mission, their motivation and engagement increase. To empower your team to make informed judgment calls, provide them with a sense of purpose, which offers the context they need to do so effectively.</p>

<p>An engineering team with clarity about why they are doing what they’re doing will make better implementation decisions that align with the “why.” When an engineering team’s purpose is clear, they understand what they are working on, why they are working on it, and how their work will have an impact. By making sure your team is clear on their purpose, you’re empowering them.</p>]]></content><author><name>Samuel</name></author><category term="EngineeringManagement" /><category term="leadership" /><category term="empowerment" /><category term="autonomy" /><category term="purpose" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37afe623-fe1c-4a38-97a1-2fb88f9b5566_1644x644.png" /><media:content medium="image" url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37afe623-fe1c-4a38-97a1-2fb88f9b5566_1644x644.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Prioritising effectively and getting work done as managers</title><link href="https://hubofco.de/engineeringmanagement/2023/06/25/prioritising-effectively-and-getting-work-done-as-managers/" rel="alternate" type="text/html" title="Prioritising effectively and getting work done as managers" /><published>2023-06-25T12:01:00+00:00</published><updated>2023-06-25T12:01:00+00:00</updated><id>https://hubofco.de/engineeringmanagement/2023/06/25/prioritising-effectively-and-getting-work-done-as-managers</id><content type="html" xml:base="https://hubofco.de/engineeringmanagement/2023/06/25/prioritising-effectively-and-getting-work-done-as-managers/"><![CDATA[<p><img 
src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc8911ec-df2f-4212-b49c-165524d35d2c_1600x670.png" alt="" /></p>

<p>You’re overwhelmed. You have an endless list of different tasks. Your schedule is stuffed with meetings and you need more time for focused work. Each meeting you attend brings more to-do items to your unending list. You’re busy throughout the day but can hardly point to something concrete you have done yourself.</p>

<p>If this describes you, you’re likely in a lead role or managing teams. This post is for you.</p>

<p>If you recently transitioned to a lead or manager role, you may start to feel like you’re not doing your job. The feeling of achievement you once derived from shipping tangible stuff disappears. At this juncture, some leaders slip back into doing IC work, like churning out features, while ignoring the more significant part of their role: leading.</p>

<p>There are two things you need as a new lead. You need a new way to frame what you do, and you need a new way to prioritise it effectively.</p>

<p>As a software lead or manager:</p>

<ul>
  <li>
    <p>You will mostly be working through people (your team).</p>
  </li>
  <li>
    <p>You will do many small tasks instead of one big chunk of work.</p>
  </li>
  <li>
    <p>You will frequently context-switch throughout the day.</p>
  </li>
  <li>
    <p>The impact of some of the work you will do will be long-term, which will make it hard to get that instant feeling of progress. The effect of that extra email you sent to finally close a strong talent will not materialise instantly.</p>
  </li>
</ul>

<p>All of the above run counter to what you did as an individual contributor. Hence you will need to change how you view your work. Rather than focusing on what you contribute single-handedly, you need to reframe the way you see the organisation and your contribution to its success.</p>

<p>Your role is no longer about your individual contribution and your individual impact.</p>

<p>Your role is about what the team does, about how the team contributes to the company’s success. Those little boring tasks you do to make your team better and improve their efficiency now matter more than the features you can code alone – that is your job.</p>

<p><strong>But you’re still drowning in tasks. So, how do you stay on top of your never-ending to-do list and still get things done?</strong></p>

<p>The answer is simple: know how to pick your battles. You need to understand tasks you should do and tasks that others should do. You need to know what should be done in the moment and what needs to wait a little bit longer.</p>

<p>One of the mistakes I made in the past was treating every task that came my way as a task I must do. While a leader may be responsible for a task, it does not mean that the leader should do everything. I was clearly missing the distinction between what I should do and what others should do. I was equating being responsible with being the one doing it.</p>

<p>Being responsible for a task does not mean you should do it. There will be hundreds of things demanding your attention because the scope of your work has increased. It’s important to constantly differentiate what you should actually be doing from what others should be doing.</p>

<p>Once you find out what you should do vs. what others should do, you can delegate what others should do and focus on what you should do.</p>

<h2 id="how-to-know-what-you-should-do-vs-what-others-should-do">How to know what you should do vs. what others should do</h2>

<p>A delegation matrix is an effective tool you can use to decide which tasks to delegate and which to retain. When you use this tool effectively, you’ll be able to free up some time to chase bigger things. A delegation matrix is a 2×2 table measuring the competence of reports on the y-axis and the competence of a manager on the x-axis.</p>

<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc8911ec-df2f-4212-b49c-165524d35d2c_1600x670.png" alt="" /></p>

<p>Let’s look at a case where a delegation matrix can be applied. You have just received a request to review a new design document from another product team to ensure approaches are aligned. There are a few options you have.</p>

<ul>
  <li>
    <p>Spring into action to review and unblock the team while other things suffer.</p>
  </li>
  <li>
    <p>Add it to your overflowing to-do list and block this team for a couple of days or a week until some time is freed up for the review.</p>
  </li>
</ul>

<p>Using a delegation matrix, you may assess that a report on your team possesses the skills to review the design doc. Clearly, both you and your report can do the review. This task goes into quadrant 3 in the image above. Everything in quadrants 1 and 3 should be delegated to your reports.</p>

<p>There will be times when you have a task neither you nor your report can do. Either delegate such tasks to your manager or ask your manager for help with them.</p>

<p>Everything that falls into Quadrant 4 should be done by you.</p>

<p><strong>Tasks in quadrants 1 and 3 are:</strong></p>

<ol>
  <li>
    <p>Tasks that can be handled adequately by reports</p>
  </li>
  <li>
    <p>Tasks for which team members have all the information for decision-making.</p>
  </li>
  <li>
    <p>Tasks that don’t require skills unique to you or your position.</p>
  </li>
  <li>
    <p>Tasks for which an individual other than you can have direct control over the task.</p>
  </li>
  <li>
    <p>Tasks and/or projects that will contribute to the growth and development of the individual.</p>
  </li>
</ol>

<p><strong>Tasks that fall in Quadrant 4</strong></p>

<ul>
  <li>
    <p>The delegation process itself: Any work to be delegated should be delegated and explained by you.</p>
  </li>
  <li>
    <p>Performance evaluations and disciplinary actions: These are managerial responsibilities.</p>
  </li>
  <li>
    <p>Coaching and morale problems.</p>
  </li>
  <li>
    <p>Planning and forecasting: Some of the detailed work, such as breaking tasks down and estimating them, can be done by engineers. However, you alone are in a position to decide team goals and how they fit in with the company’s overall goals.</p>
  </li>
  <li>
    <p>Confidential tasks.</p>
  </li>
</ul>
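
<p>The quadrant rules above can be sketched as a small Python function. This is only an illustration: the post spells out quadrant 3 (both you and a report can do the task), so the numbering used here for quadrants 1, 2 and 4 is an assumption, and the names <code>quadrant</code>, <code>decide</code> and <code>Action</code> are made up for this example.</p>

```python
from enum import Enum


class Action(Enum):
    DELEGATE_TO_REPORT = "delegate to a report"
    ESCALATE_TO_MANAGER = "delegate to, or get help from, your manager"
    DO_IT_YOURSELF = "do it yourself"


def quadrant(report_can_do: bool, you_can_do: bool) -> int:
    """Place a task in the 2x2 delegation matrix.

    Assumed numbering (only quadrant 3 is explicit in the post):
    1 = only a report can do it, 2 = neither of you can,
    3 = both of you can, 4 = only you can.
    """
    if report_can_do:
        return 3 if you_can_do else 1
    return 4 if you_can_do else 2


def decide(report_can_do: bool, you_can_do: bool) -> Action:
    """Map a task to the action the matrix suggests."""
    q = quadrant(report_can_do, you_can_do)
    if q in (1, 3):
        # A report is competent to do it, so it should be delegated.
        return Action.DELEGATE_TO_REPORT
    if q == 4:
        return Action.DO_IT_YOURSELF
    # Neither you nor a report can do it.
    return Action.ESCALATE_TO_MANAGER


# The design-doc review from the example: both you and a report can do it.
design_doc_review = decide(report_can_do=True, you_can_do=True)
```

<p>Competence is only one input. Items such as performance evaluations or confidential tasks stay with you regardless of who is able to do them, so treat the function as a first pass rather than a rule.</p>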

<h2 id="separate-urgent-important-tasks-from-non-urgent-tasks">Separate urgent, important tasks from non-urgent tasks</h2>

<p>Have you categorised your tasks into what you should do vs. what others should do and delegated what others should do, and yet you still feel like you’re drowning in assignments? Then, the next step is to separate urgent and important tasks from non-urgent and important tasks.</p>

<p><strong>Do urgent and important tasks now.</strong></p>

<p>Urgent and important tasks have clear deadlines and consequences if immediate action is not taken. Urgent tasks include responding to incidents, onboarding a new engineer, resolving conflicts, and unblocking a report.</p>

<p><strong>Schedule important tasks that are not urgent</strong>.</p>

<p>Important but not urgent tasks have no specific deadline, but they bring you closer to your goals. These are tasks that are easy to procrastinate on. Examples are defining your team’s strategy and vision. Schedule these tasks by blocking time in your calendar to do them.</p>

<p><strong>Delegate urgent but unimportant tasks</strong>.</p>

<p>You may still find a couple of urgent but unimportant tasks in your to-do list. There are <a href="https://www.researchgate.net/publication/327103488_The_Mere_Urgency_Effect">studies</a> that show that we’re naturally biased towards picking urgent tasks even if they’re less important. Tasks that are urgent but not important are best delegated.</p>
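
<p>As a rough sketch, the three rules in this section fit in a few lines of Python. The function name <code>triage</code> and the sample tasks are assumptions for illustration. The post does not say what to do with tasks that are neither urgent nor important, so the final branch (“drop or defer”) is also an assumption.</p>

```python
def triage(urgent: bool, important: bool) -> str:
    """Return the suggested handling for a task."""
    if urgent and important:
        return "do now"
    if important:
        return "schedule"
    if urgent:
        return "delegate"
    # Not covered in the post; assumed here for completeness.
    return "drop or defer"


# Hypothetical to-do list: task name mapped to (urgent, important).
tasks = {
    "respond to an incident": (True, True),
    "define the team strategy": (False, True),
    "reply to a vendor survey due today": (True, False),
}

plan = {name: triage(urgent, important)
        for name, (urgent, important) in tasks.items()}
```

<p>Checking importance before urgency in the second branch is deliberate: it keeps important work from being pushed aside by whatever happens to be urgent.</p>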

<p>To conclude, if you’re going to take one thing from this post, it should be this: being responsible doesn’t mean doing it yourself. Understanding what you should do vs. what others should do is key to staying on top of your schedule.</p>

<h2 id="additional-resources">Additional resources</h2>

<p><a href="https://leaddev.com/professional-development/five-valuable-lessons-new-tech-lead">Five valuable lessons for a new tech lead</a><br />
<a href="https://www.theengineeringmanager.com/management-101/feeling-productive/">Feeling productive</a></p>]]></content><author><name>Samuel James</name></author><category term="EngineeringManagement" /><category term="prioritisation" /><category term="software leads" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc8911ec-df2f-4212-b49c-165524d35d2c_1600x670.png" /><media:content medium="image" url="https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc8911ec-df2f-4212-b49c-165524d35d2c_1600x670.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Three Ways to Gain Visibility into Indirectly Managed Teams</title><link href="https://hubofco.de/engineeringmanagement/2023/06/10/three-ways-to-gain-visibility-into-indirectly-managed-teams/" rel="alternate" type="text/html" title="Three Ways to Gain Visibility into Indirectly Managed Teams" /><published>2023-06-10T08:11:00+00:00</published><updated>2023-06-10T08:11:00+00:00</updated><id>https://hubofco.de/engineeringmanagement/2023/06/10/three-ways-to-gain-visibility-into-indirectly-managed-teams</id><content type="html" xml:base="https://hubofco.de/engineeringmanagement/2023/06/10/three-ways-to-gain-visibility-into-indirectly-managed-teams/"><![CDATA[<p>How do you foster execution, remove roadblocks from frontline teams?</p>

<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4a5a817-0793-4f87-ab70-9e3a2fb81980_1164x983.png" alt="" /></p>

<p>Will Larson provided <a href="https://lethain.com/mail-bag-resources-for-engineering-directors/">three ways</a> to do this that I have found useful.</p>

<ul>
  <li>
    <p>First, you need to understand what’s happening on indirectly managed teams below you.</p>
  </li>
  <li>
    <p>Second, add things necessary for execution.</p>
  </li>
  <li>
    <p>Third, remove things getting in the way of execution.</p>
  </li>
</ul>

<p>The biggest puzzle to solve is understanding what is happening in the indirectly managed teams.</p>

<p>The higher you go in an organisation, the fewer details you know about the day-to-day jobs of the people below you and the unique challenges they face. An engineering director will have more knowledge about how each piece of work connects with teams under her purview but less knowledge about the codebase or the nitty-gritty of the environment frontline engineers work in and the impact it has on execution.</p>

<p>To draw an analogy, it’s like zooming in and out on maps. In a zoomed-out mode, you see the big picture. In a zoomed-in mode, you get a close look at select details. Frontline engineers have a zoomed-in experience. And the higher you go in an organisation, the more zoomed out your view becomes. Folks at the top of an organisation tend to see and experience things more broadly than those layers below them, who may see things more narrowly but in greater depth.</p>

<p>There are a few areas engineering leaders can look at to get the visibility required to foster execution and remove roadblocks from frontline teams. That starts with finding a way to get quantitative and qualitative data that will help you:</p>

<ul>
  <li>
    <p>Understand how your teams are building what they’re building on time and within the budget you have.</p>
  </li>
  <li>
    <p>Understand whether what your teams have built is running at the desired level of performance and reliability and will continue to do so for the foreseeable future.</p>
  </li>
  <li>
    <p>Understand if the teams building that stuff are engaged and happy to continue building what they’re building.</p>
  </li>
  <li>
    <p>Understand if the users you’re building for are getting value and deriving satisfaction from what you’re building.</p>
  </li>
</ul>

<h2 id="understanding-how-your-teams-are-building-what-theyre-building">Understanding how your teams are building what they’re building.</h2>

<p>Understanding how your teams build requires you to understand the process they follow to build what they’re building. Let’s call this “process”. Process is the way your teams habitually do things. It is how your teams get work done on time and within budget. If you’re a pizza company, process is how you turn dough into edible pizzas delivered to consumers.</p>

<p>For a software company, your process is ideally your approach to building what you’re building.</p>

<p>If your organisation is like most organisations, your time and budget are finite. You always have to deliver value to users within a certain budget and a specific time. Because of this, you have to find the best way to deliver value to users within the budget that you have.</p>

<p>To gain an understanding of how your team builds stuff and the possible roadblocks, take a look at where engineers spend their time, things that affect execution, how they collaborate, the tooling you have that enables them, the codebase, systems and architectures they work in.</p>

<ul>
  <li>
    <p>Does architecture/codebase allow teams to move as fast as possible?</p>
  </li>
  <li>
    <p>Do your teams have the right tooling?</p>
  </li>
  <li>
    <p>Do they have autonomy to spin up new services, build, scale and deploy them without hand-offs? How long does it take them to do this?</p>
  </li>
  <li>
    <p>Are your teams empowered to deliver value independently, own their backlog, ideate and figure out the most important thing to build?</p>
  </li>
  <li>
    <p>How do different functions within the teams work together? Do they have a shared understanding of what’s being built?</p>
  </li>
  <li>
    <p>Does everyone know where they fit in and what roles they need to play in what is being built?</p>
  </li>
  <li>
    <p>Do they have the necessary tools?</p>
  </li>
</ul>

<p>Answers to questions like these will point you in the right direction in figuring out what may be standing in the way of execution.</p>

<h2 id="3-ways-to-have-the-visibility-required">3 Ways to have the visibility required</h2>

<p>You want to gain a better understanding of unique challenges your indirectly managed teams face? Include the following in your data gathering process.</p>

<h3 id="1-run-developers-experience-surveys-quarterly">1. Run developer experience surveys quarterly</h3>

<p>Running surveys is one way to understand the experience your engineers have. Engineers’ development experience can reveal a ton about how they’re building what they’re building as well as what is standing in the way of getting things done.</p>

<p>I have seen engineering surveys centred on tools and technology only. Surveys like that often fall short. Sometimes the biggest source of roadblocks could be unrelated to tech or tools. A process or a social contract that has nothing to do with tech can be equally a source of friction for engineers.</p>

<p>I like the way C J Silverio puts it in this <a href="https://blog.ceejbot.com/posts/reduce-friction/">post</a>.</p>

<blockquote>
  <p><em>If people are regularly doing any end-run around a process to get work done (say, regularly asking for rubber-stamp PRs so they can be unblocked), you have a process that’s not earning back its energy cost. Fix it.</em></p>
</blockquote>

<p>A simple survey with questions like the ones below could reveal a ton about your engineering processes:</p>

<ul>
  <li>
    <p>How easy is it to find and access documentation?</p>
  </li>
  <li>
    <p>What manual work that the team does can be automated?</p>
  </li>
  <li>
    <p>What process do you find valuable?</p>
  </li>
  <li>
    <p>What process would you improve?</p>
  </li>
  <li>
    <p>How satisfied are you with the tooling you use daily, e.g. linters, IDE?</p>
  </li>
  <li>
    <p>What would you add or improve in our tooling?</p>
  </li>
</ul>

<p>If you’re looking for where to start, <a href="https://docs.google.com/spreadsheets/d/1gGKtZ78sKbTzxQTydcZGEB5HiLeXsHmWNqpaTL6ikQU/edit#gid=0">developer experience survey questions</a> by Laura Tacho is a good one.</p>

<h3 id="2-hold-1-on-1s-and-skip-level-1-on-1s">2. Hold 1-on-1s and skip-level 1-on-1s</h3>

<p>When 1-on-1s are used effectively, they can be a platform for gathering information that lets you know what’s going on in the team and with each engineer. A 1-on-1 is a great time to offer your perspective on what is going well and what is not. But it is also a good time to gather qualitative data and to confirm or disprove what you believe is not going well.</p>

<p>If you’re leading from two or more layers away, hold skip-level 1-on-1s with frontline engineers.</p>

<ul>
  <li>
    <p>It will help you to gather information about how your managers are really doing beyond what they tell you.</p>
  </li>
  <li>
    <p>It will help you get a pulse on what’s happening on the front lines.</p>
  </li>
  <li>
    <p>It will help you learn where there is dysfunction, insufficient communication, or confusion within parts of your organisation.</p>
  </li>
</ul>

<h3 id="3-track-dora-metrics">3. Track DORA metrics</h3>

<p>DORA metrics came out of six years’ worth of surveys conducted by the DORA team, which identified four metrics that elite-performing software teams use to measure their performance.</p>

<ul>
  <li>
    <p><strong>Deployment Frequency</strong>—How often do you release code to production? How often do you deliver value to users?</p>
  </li>
  <li>
    <p><strong>Lead Time for Changes</strong>— The amount of time it takes a commit to get into production. How long does it take to deliver value to users?</p>
  </li>
  <li>
    <p><strong>Change Failure Rate</strong>— What is the percentage of this value that is defective?</p>
  </li>
  <li>
    <p><strong>Time to Restore Service</strong>—How fast can we get back up when we fail? How long does it take an organisation to recover from a failure in production?</p>
  </li>
</ul>
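
<p>To make the four definitions concrete, here is a minimal Python sketch that computes them from a list of deployment records. The <code>Deployment</code> record, its field names and the sample data are all assumptions for illustration, not the output of any particular tool. It also assumes that time to restore is measured from the failing deployment to the moment service was restored, and it reports medians for the two durations.</p>

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarise the four DORA metrics for one reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Hypothetical week with four deployments, one of which failed.
start = datetime(2023, 6, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=4)),
    Deployment(start, start + timedelta(hours=8), failed=True,
               restored_at=start + timedelta(hours=9)),
    Deployment(start, start + timedelta(hours=6)),
    Deployment(start, start + timedelta(hours=2)),
]
metrics = dora_metrics(sample, period_days=7)
```

<p>Computing the same summary for each week or month is what lets you compare periods and spot a trend, which is where these numbers become useful.</p>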

<p>Measuring productivity is a hard topic in software engineering because companies have attempted to measure developers’ productivity but ended up measuring the wrong thing. A few companies learned from this and stopped measuring anything at all.<br />
One thing is certain: developers’ productivity cannot be reduced to a single metric or dimension. There are multiple dimensions to it.</p>

<p>Personally, I find DORA metrics effective because they can show you trends that allow you to ask crucial questions. When deployment frequency trends downward, it raises a lot of questions you can ask your team.</p>

<p>One of my experiences with using DORA started at <a href="https://www.tier.app/en/">TIER</a>. We were tracking how well we were doing week over week (WoW) and month over month (MoM). Once every two weeks, I would sit down with Engineering Leads to discuss the metrics.</p>

<p>When we saw deployment frequency trending downwards or upwards, I would ask questions to understand what was going on. When we got better at putting code to production frequently, I wanted to know what changes we made.</p>

<p>In one of the review meetings with engineering leads, deployment frequency was trending downward. As I sat down with them to understand the root cause and how I could better support them, we saw that there were issues with our deployment infrastructure. The infrastructure team made changes to the base infrastructure that this team was yet to update to.</p>

<p>Most of the engineers in this team were new and had missed the comms that had gone out earlier on why the team should upgrade. Once we found the root cause, we were able to take concrete action, work together with the infrastructure team and get the support needed.</p>

<h4 id="additional-resources">Additional resources</h4>

<p>Looking for more input? The following links are great resources.</p>

<ul>
  <li>
    <p><a href="https://blog.ceejbot.com/posts/reduce-friction/">Reduce friction</a></p>
  </li>
  <li>
    <p><a href="https://getlighthouse.com/blog/skip-level-meetings-one-on-ones/">Skip Level Meetings: Everything You Need To Know About Skip Level 1 On 1s</a></p>
  </li>
  <li>
    <p><a href="https://speakerdeck.com/abiodunjames/how-engineering-managers-can-lead-with-visibility">How engineering leaders can lead with visibility</a></p>
  </li>
</ul>]]></content><author><name>Samuel James</name></author><category term="EngineeringManagement" /><category term="visibility" /><summary type="html"><![CDATA[How do you foster execution, remove roadblocks from frontline teams?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4a5a817-0793-4f87-ab70-9e3a2fb81980_1164x983.png" /><media:content medium="image" url="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4a5a817-0793-4f87-ab70-9e3a2fb81980_1164x983.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to build a high-performing software team</title><link href="https://hubofco.de/2023/01/28/how-to-build-a-high-performing-core-team/" rel="alternate" type="text/html" title="How to build a high-performing software team" /><published>2023-01-28T12:00:00+00:00</published><updated>2023-01-28T12:00:00+00:00</updated><id>https://hubofco.de/2023/01/28/how-to-build-a-high-performing-core-team</id><content type="html" xml:base="https://hubofco.de/2023/01/28/how-to-build-a-high-performing-core-team/"><![CDATA[<p><img src="https://res.cloudinary.com/samueljames/image/upload/v1674910958/aligned_vs_mis-aligned_team.png" alt="aligned team" /></p>

<p><strong>How do I build a high-performing team?</strong></p>

<p>This is one of the common questions managers ask themselves. No one wants a low-performing team. We all want to build effective and high-performing teams – a team that is highly interdependent, bound with a common goal, plans work, makes decisions, solves problems and delivers superior results.</p>

<p>But you can’t just pull a bunch of people together in a room and expect outstanding performance.</p>

<p>Building a high-performing team requires deliberate effort. You have to create an environment that allows and breeds outstanding performance.</p>

<p>Over the years, I have experienced underperforming and high-performing teams. I saw the kind of environment where high-performing teams thrive, things high-performing teams do differently and the culture that enables them.</p>

<p>But how do you create a high-performing team and what levers are there to pull?</p>

<p>There’s more than one way to skin a cat. But certainly, there are a few levers you can pull that will put your team on the way to superior performance.</p>

<h2 id="hire-strong-talents-strong-talents-make-a-strong-team">Hire strong talents. Strong talents make a strong team.</h2>

<p>You want to build a strong-performing team? Hire strong people. Why hire one strong senior engineer when you can have two mid-level engineers for the same cost? Well, superior performance doesn’t come cheap.</p>

<p>Strong engineers are high achievers, trailblazers and catalysts for improvement. They’re worth the cost.</p>

<p>If you want to build a strong software team, hire strong people in product, data, design and engineering. Hire people who constantly challenge themselves and work to better or improve themselves.</p>

<p>Hiring a group of people and expecting outstanding performance doesn’t work. There have to be some thoughts behind the group composition and why they exist.</p>

<p>We often fail in one of two ways. If we don’t fail at composing the right team by carefully hiring people who bring complementary skills, we fail at defining why the team exists and ensuring every team member understands why they’re doing what they’re doing.</p>

<h2 id="empower-them">Empower them</h2>

<p>Have you hired a strong team? Coach, empower and get out of the way as necessary. A high-performing team is an empowered team that makes decisions within a defined boundary. They are able to make effective micro-decisions in their day-to-day work.</p>

<p>An engineering team that relies on a single decision-maker, be it a product manager or an engineering manager, is not empowered. Progress stalls when a team depends on a single person to make decisions.</p>

<p>Coach your team to the point that everyone knows what the expected outcome is. Coach them to a point where they know what they should be doing and what they should not.</p>

<p>An environment where one person is solely responsible for figuring out what to build, which is then handed over to the team, creates a situation where a team fails to see why they’re doing what they’re doing. It becomes a game of “they ask us, we do it”.</p>

<h2 id="create-synergy">Create synergy</h2>

<p>Synergy is when two or more people work together to produce something of value. The more people you have, the more difficult it is to create synergy. This is one of the reasons why the concept of <a href="https://docs.aws.amazon.com/whitepapers/latest/introduction-devops-aws/two-pizza-teams.html">two-pizza teams</a> garnered popularity.</p>

<p>Synergy, at its core, is about a team working, connecting and collaborating effectively.</p>

<p>Communication and collaboration are key to maintaining high synergy. Folks having to chase information they need to do their work drains motivation and depletes trust between teams.</p>

<p>When communication breaks down, synergy suffers. And it could feel like every other person is out there to prevent you from doing your job. Often, it’s just a misalignment between functions, as rightly said in this <a href="https://medium.com/big-on-development/how-to-address-tech-debt-vs-product-debt-vs-business-debt-42d43d5f768c">post</a>:</p>

<blockquote>
  <p>Every team in an org has its own pain points as well as its objective. While it seems that one department refuses to fix the problem of the other, it is also true that most people in the company work with good intentions. Hence, if everyone is trying to do a good job and they also complain about each other, the problem therefore lies in communication (or more specifically, communication breakdown).</p>
</blockquote>

<p>If you want to build a high-performing team, pay attention to how different functions collaborate and the willingness of people to share information that benefits the wider group.</p>

<p>One of the best ways I have seen this play out is having a “default open” culture where ideas, information and feedback flow freely.</p>

<h2 id="create-a-culture-of-learning-and-improvements">Create a culture of learning and improvements</h2>

<p>According to a model developed by psychologist Bruce Tuckman, a team goes through <a href="https://www.sixsigmadaily.com/what-is-forming-storming-norming-performing/#:~:text=The%20concept%20of%20Forming%2C%20Storming,on%20accomplishing%20a%20shared%20goal.">four stages of psychological development</a> known as forming, storming, norming and performing. For a team to get to the performing stage, there have to be a lot of mistakes and lessons learned along the way.</p>

<p>A high-performing team is not formed in a day. High-performing teams go through periods when they’re low performing. It’s through making mistakes and learning from them that the team gets better and better as days go by.</p>

<p>A team that reflects on how to become more effective and then tunes and adjusts its way of working gets better as time passes.</p>

<p>As Jocko Willink and Leif Babin stated in their book <em>The Dichotomy of Leadership</em>:</p>

<blockquote>
  <p>No team can deliver flawless performance. No one can achieve perfection. What makes the best team great is that when they make mistakes, they acknowledge them, take ownership and make corrections to upgrade their performance. With each iteration, they enhance its performance.</p>
</blockquote>

<p>High-performing teams make incremental changes in processes (way of working) to improve quality and efficiency. They have a mindset that whatever is good today might not be good tomorrow. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.</p>

<p>To build a culture of learning and improvement, you have to create an environment where people feel it’s okay to fail and are not afraid to voice their shortcomings.</p>

<p>This is where psychological safety plays a crucial role.</p>

<h2 id="build-psychology-safety">Build psychological safety</h2>

<p>Psychological safety is the belief that you have the freedom to speak your mind without being punished for it. Such belief is rooted in deep trust between peers and leadership. In a software team with high psychological safety, members are more open to taking risks and trying new approaches, and are not afraid to fail. It’s this tolerance for mistakes and risks that allows the team to be more innovative, develop the muscle to try new things, fail quickly and learn.</p>

<p>As a leader, it’s your responsibility to build an environment where people not only feel free to share their opinions but also feel that those opinions are welcome and that you love to hear them.</p>

<hr />

<p>In summary, building high-performing teams takes time and deliberate effort, but it’s worth it. To build high-performing teams, hire strong talent, empower them, create synergy and build a culture of learning and improvement.</p>

<iframe src="https://softwareleads.substack.com/embed" width="100%" height="320" style="border:0px solid #EEE; background:white;" frameborder="0" scrolling="no"></iframe>]]></content><author><name>James Samuel</name></author><category term="management" /><category term="softwareleads" /><category term="high-performing" /><category term="leadership" /><summary type="html"><![CDATA[But you can't just pull a bunch of people together in a room and expect outstanding performance. To build high-performing teams, hire strong talents, empower them, create synergy and build a culture of learning and improvement.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://res.cloudinary.com/samueljames/image/upload/v1674910958/aligned_vs_mis-aligned_team.png" /><media:content medium="image" url="https://res.cloudinary.com/samueljames/image/upload/v1674910958/aligned_vs_mis-aligned_team.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How managers can lead with visibility – Part 1</title><link href="https://hubofco.de/softwareengineering/engineeringmanagement/2022/12/10/how-managers-can-lead-with-visibility-part-1/" rel="alternate" type="text/html" title="How managers can lead with visibility – Part 1" /><published>2022-12-10T10:09:00+00:00</published><updated>2022-12-10T10:09:00+00:00</updated><id>https://hubofco.de/softwareengineering/engineeringmanagement/2022/12/10/how-managers-can-lead-with-visibility-part-1</id><content type="html" xml:base="https://hubofco.de/softwareengineering/engineeringmanagement/2022/12/10/how-managers-can-lead-with-visibility-part-1/"><![CDATA[<p>Managers’ roles and responsibilities come in different forms and shapes, but one thing is clear: if you’re a manager, you’re there to get better outcomes from the people you work with. To be able to get better outcomes from the people you lead, you need to be able to understand what is happening in that team. 
Put another way, you need visibility into that team.</p>

<blockquote>
  <p>You were not in control. You had no visibility: maybe there was a car in front of you, maybe not.
 –Alain Prost</p>
</blockquote>

<p>For tech leaders, external factors, culture, and engineers’ personal lives can affect delivery flow or the team’s ability to deliver on business outcomes and goals. You need to equip yourself with data that gives you fast feedback, so you can stay on top of these factors and maintain predictable, consistent delivery of value to users.</p>

<p>Whether you’re managing from one or two layers away, there are four main areas from which you need both quantitative and qualitative data to have the visibility required to lead your team effectively. I categorize these four areas as process, operation, people and product.</p>

<p>You need to get both quantitative and qualitative data that will help you:</p>

<ul>
  <li>
    <p>Understand how your team is building what they’re building on time and within the budget you have – <strong>Process</strong></p>
  </li>
  <li>
    <p>Understand if what has been built is running at the desired level of performance and reliability and that it will continue to do so within the foreseeable future – <strong>Operation</strong>.</p>
  </li>
  <li>
    <p>Understand if the folks building that stuff are engaged and happy to continue building it – <strong>People</strong>.</p>
  </li>
  <li>
    <p>Understand if the folks (users) you’re building for are getting value and deriving satisfaction from what you’re building – <strong>Product</strong></p>
  </li>
</ul>

<p><img src="https://hubofco.de/uploads/Pillars.jpg" alt="Framework for leading with visibility" /></p>

<h1 id="process--understanding-how-your-team-is-building-what-theyre-building">Process – Understanding how your team is building what they’re building</h1>

<p>Process is the way you habitually do things. It is how your team gets work done on time and within budget. If you’re a pizza company, process is how you turn dough into edible pizzas delivered to consumers. As a software company, your process is your approach to building what you’re building.</p>

<p>If your organization is like most organizations, your time and budget are finite. I have yet to see an organization that has enough of both. We always have to get value delivered within a certain budget and within a specific time, which calls for a balancing act.</p>

<p>One question most managers have is how their teams are building what they’re building within the time and budget they have. 
To answer this question, you need visibility into where engineers spend their time and into the things that affect execution: how collaboration works across different functions, the tooling that enables your team, and the codebase, systems and architectures your engineers work in.</p>

<ul>
  <li>
    <p>Does the architecture/codebase allow teams to move as fast as possible?</p>
  </li>
  <li>
    <p>Does your team have the right tooling?</p>
  </li>
  <li>
    <p>Do they have autonomy to start new services, build, scale and deploy them independently?</p>
  </li>
  <li>
    <p>Are your teams empowered to deliver value independently, own their backlog, ideate and figure out the most important thing to build?</p>
  </li>
  <li>
    <p>How do different functions within the team work together? Do they have a shared understanding of what’s being built? Does everyone know where they fit in and what roles they need to play in what is being built?</p>
  </li>
</ul>

<p>Answers to questions like these will point you in the right direction in figuring this out.</p>

<h2 id="3-ways-to-have-the-visibility-required">3 Ways to have the visibility required</h2>

<p>Here are ways engineering leaders can have the visibility required.</p>

<h3 id="developers-experienceproductivity-survey">Developer Experience/Productivity Survey</h3>

<p>One of the ways to understand how your team is building what they’re building is to run regular engineering surveys that capture engineers’ experience. What engineers experience while developing can reveal a ton about how they’re building what they’re building and where bottlenecks or pain points exist.</p>

<p>Even if your organization already has a way of measuring employee engagement, you should still run surveys centered on developer experience. Organization-wide engagement surveys are sometimes too general and may not effectively capture the engineering experience.</p>

<p>A simple survey with questions like these could reveal a ton about engineering processes:</p>

<ul>
  <li>
    <p>How easy is it to find and access documentation?</p>
  </li>
  <li>
    <p>What manual work that the team does could be automated?</p>
  </li>
  <li>
    <p>What process do you find valuable?</p>
  </li>
  <li>
    <p>What process would you improve?</p>
  </li>
  <li>
    <p>How satisfied are you with the tooling you use daily, e.g. linters, IDE?</p>
  </li>
  <li>
    <p>What would you add or improve in our tooling?</p>
  </li>
</ul>

<p>When we ran some of these surveys at <a href="https://www.tier.app/en/">TIER</a>, we realized that documentation and tooling were issues that needed to be addressed. We started investing in consolidating documentation and making it accessible. Along the way, we had to solve the challenge of keeping documentation close to the codebase where engineers work.</p>

<p>If you’re looking for where to start, the <a href="https://docs.google.com/spreadsheets/d/1gGKtZ78sKbTzxQTydcZGEB5HiLeXsHmWNqpaTL6ikQU/edit#gid=0">developer experience survey questions</a> by Laura Tacho are a good starting point.</p>

<h3 id="1-on-1s">1-on-1s</h3>

<p>When 1-on-1s are used effectively, they can be a source of information that lets you know what’s going on in the team and with each engineer individually.</p>

<p>When I first started having 1-on-1s, one thing I wanted to avoid was making it a meeting to get project updates and progress reports. In my first meeting with a report, I would set expectations of what the meeting was and was not about. One thing I always made clear was to refrain from talking about projects or getting project updates in the meeting.</p>

<p>With time, I realized that talking about projects was unavoidable to support my team effectively. As I listened to concerns from my teams, they always pointed back to the project they were working on.</p>

<p>If a report has an issue with another engineer on a team, it’s due to a project or task they’re working on. If a report is not getting what she wants from another team, it’s due to a task or project she’s working on.</p>

<p>So I realized it’s okay to talk about projects in 1-on-1s if there is an opportunity to unblock or coach a report.</p>

<p>You should use 1-on-1s to pick up signals of things going well or not going well in your team. This is why 1-on-1s can be effective in helping you understand how your team is building what they’re building.</p>

<p>Here is an example of a conversation with a report.</p>

<p><strong>Manager</strong>: What do you think about how project XYZ is going?</p>

<p><strong>Report</strong>: Well, it’s going fine except that we underestimated the project. There was ABC we didn’t think about or account for when we committed to the project. To do XYZ, we need to do ABC which will take 3 more weeks. In fact we don’t even know if it can be done yet. We’re currently undertaking tech discovery on how to do ABC.</p>

<p>If conversations like this become a pattern, the team is probably not investing in proper product and technical discovery before taking on projects.</p>

<h3 id="dora-metrics">DORA Metrics</h3>

<p>Measuring productivity is a hard topic in software engineering. Historically, companies have attempted to measure developers’ productivity but ended up measuring the wrong thing. One thing is certain: developers’ productivity cannot be reduced to a single metric or dimension. There are multiple dimensions to it.</p>

<p>One approach that has worked for me in the past is measuring DORA metrics.</p>

<p>DORA metrics emerged from six years’ worth of surveys conducted by the DORA team, which identified four metrics that elite-performing software teams use to measure their performance.</p>

<ul>
  <li>
    <p><strong>Deployment Frequency</strong>: How often do you release code to production? How often do you deliver value to users?</p>
  </li>
  <li>
    <p><strong>Lead Time for Changes</strong>: The amount of time it takes a commit to get into production. How long does it take to deliver value to users?</p>
  </li>
  <li>
    <p><strong>Change Failure Rate</strong>: What percentage of the changes you release to production result in a failure? In other words, how much of the value you deliver is defective?</p>
  </li>
  <li>
    <p><strong>Time to Restore Service</strong>: How fast can we get back up when we fail? How long does it take an organization to recover from a failure in production?</p>
  </li>
</ul>
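
<p>To make the four definitions above concrete, here is a minimal sketch of how they could be computed from a list of deployment records. Everything in it is an assumption for illustration: the <code>Deployment</code> record shape, the <code>dora_metrics</code> function, and the choice to measure time to restore from the moment of deployment are hypothetical, not how any particular tool or team (including mine) calculates them.</p>

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarise the four DORA metrics over one reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment in the period")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    # Assumption: restore time is measured from the failing deployment.
    restore_times = [
        d.restored_at - d.deployed_at
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "deployments_per_week": len(deployments) / (period_days / 7),
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Made-up sample data covering a two-week period.
start = datetime(2022, 12, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=4)),
    Deployment(start + timedelta(days=3), start + timedelta(days=4)),
    Deployment(
        start + timedelta(days=7),
        start + timedelta(days=7, hours=8),
        failed=True,
        restored_at=start + timedelta(days=7, hours=10),
    ),
    Deployment(start + timedelta(days=10), start + timedelta(days=10, hours=2)),
]
summary = dora_metrics(sample, period_days=14)
for name, value in summary.items():
    print(f"{name}: {value}")
```

<p>Medians are used here rather than averages so that one unusually slow change does not dominate the number. The absolute values matter less than how they move from one period to the next.</p>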

<p>I have found DORA metrics effective because they can show you trends that allow you to ask crucial questions. When deployment frequency trends downward, it raises a lot of questions you can ask your team.</p>

<p>My teams at <a href="https://www.tier.app/en/">TIER</a> adopted DORA metrics. We tracked how well we were doing week over week (WoW) and month over month (MoM). Once every two weeks, I would sit down with Engineering Leads to discuss the metrics.</p>

<p>When we saw deployment frequency trending downwards or upwards, I would ask questions to understand what was going on. When we got better at shipping code to production frequently, I also wanted to know what changes we had made.</p>

<p>In one of the review meetings with engineering leads, deployment frequency was trending downward. As I sat down with them to understand the root cause and how I could better support them, we saw that there were issues with our deployment infrastructure. The infrastructure team had made changes to the base infrastructure that this team was yet to update to.</p>

<p>Most of the engineers in this team were new and had missed out on comms that had gone out earlier on why the team should upgrade. On finding the root cause, we were able to take a more actionable step, work together with the infrastructure team and get the support needed.</p>]]></content><author><name>Samuel James</name></author><category term="SoftwareEngineering" /><category term="EngineeringManagement" /><category term="leadership" /><category term="engineeringmanager" /><category term="management" /><category term="visibility" /><summary type="html"><![CDATA[Managers’ roles and responsibilities come in different forms and shapes, but one thing is clear: if you’re a manager, you’re there to get better outcomes from the people you work with. To be able to get better outcomes from the people you lead, you need to be able to understand what is happening in that team. To be better put, you need to have visibility into that team.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/images/logo.png" /><media:content medium="image" url="https://hubofco.de/images/logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to Onboard Engineers as a Hiring Manager</title><link href="https://hubofco.de/onboarding/2022/04/09/how-to-onboard-engineers-as-a-hiring-manager/" rel="alternate" type="text/html" title="How to Onboard Engineers as a Hiring Manager" /><published>2022-04-09T09:37:00+00:00</published><updated>2022-04-09T09:37:00+00:00</updated><id>https://hubofco.de/onboarding/2022/04/09/how-to-onboard-engineers-as-a-hiring-manager</id><content type="html" xml:base="https://hubofco.de/onboarding/2022/04/09/how-to-onboard-engineers-as-a-hiring-manager/"><![CDATA[<p>While every organization has its own way of onboarding engineers, there are few elements that will make any engineering onboarding effective. 
In this post, I share thoughts I have adopted that govern how I onboard engineers into my teams.</p>

<p>Engineering onboarding entails communicating your company’s expectations (technical, process, product, cultural, and professional) to new hires in a way that makes the knowledge useful and practical.</p>

<p>The time it takes for a new engineer to be a productive member of a team is a function of how effective your onboarding process is.</p>

<p>I do realize there is a lot that goes into onboarding a new engineer. There are roles played by Procurement, the HR team, and more. An HR onboarding will usually contain training programs designed to help new hires get acquainted with a company and its culture, but it is typically generalized and may not include engineering onboarding. As a result, this post focuses on onboarding from the standpoint of a hiring manager.</p>

<h2 id="start-from-the-hiring-managers-interview">Start from the hiring manager’s interview.</h2>

<p>Engineering onboarding starts from the hiring manager’s interview. This is the first opportunity to help a candidate see her potential impact on the job. Don’t only assess the candidate’s fit during the hiring manager interview but also help the candidate connect to your team’s mission and purpose.</p>

<p>You’re doing onboarding work if you’re able to help a candidate understand what the role is about and how she can possibly fit in. The hiring manager’s interview can serve as an onboarding foundation that will be built on later if the candidate makes it to an offer stage.</p>

<h2 id="use-the-candidates-notice-period-to-your-advantage">Use the candidate’s notice period to your advantage</h2>

<p>Candidates drop out at every stage of the hiring funnel. For a very strong talent, dropping out after signing an offer is a possibility. Some candidates have long notice periods. It’s important to keep the communication going.</p>

<p>You should leverage the notice period to continue to build on the candidate’s excitement about your company, your team, and how you work. My strategies include reaching out by email a week after an offer is signed to share our excitement.</p>

<p>As the notice period draws to a close, I reach out to see if they need any assistance and use the opportunity to send our onboarding guide so the candidate can prepare ahead.</p>

<h2 id="have-a-comprehensive-written-down-onboarding-guide">Have a comprehensive written-down onboarding guide</h2>

<p>One of the elements I have found to be important when onboarding engineers is having a written-down onboarding guide. PowerPoint presentations should not replace a written-down onboarding guide that new hires can refer to from time to time. Nothing scales better than written words because they’re always available when people aren’t.</p>

<p>Your onboarding guide should help candidates answer the following questions:</p>

<ul>
  <li>
    <p>What’s the team’s culture like?</p>
  </li>
  <li>
    <p>What is the company culture like?</p>
  </li>
  <li>
    <p>How does the team function or work?</p>
  </li>
  <li>
    <p>What are the tools used?</p>
  </li>
  <li>
    <p>How should access to tools be requested?</p>
  </li>
  <li>
    <p>Who are the people to meet?</p>
  </li>
  <li>
    <p>What are the company, product, and tech strategies?</p>
  </li>
  <li>
    <p>What are the engineering principles?</p>
  </li>
  <li>
    <p>Where can documentation be found?</p>
  </li>
  <li>
    <p>What is the current team roadmap?</p>
  </li>
  <li>
    <p>FAQs and more.</p>
  </li>
</ul>

<h2 id="dont-leave-engineers-to-figure-out-the-tech-part">Don’t leave engineers to figure out the tech part.</h2>

<p>I have had the opportunity to work with six different companies on three continents in the last ten years. Each of these companies had its own unique way of shipping software. They didn’t all use the same techniques or approaches to software development.</p>

<p>Regardless of how experienced a new hire is, they should be onboarded into how you build software.</p>

<p>The “she is an engineer, she will figure it out” approach will not work.</p>

<p>She needs to get familiar with your workflows, processes, and practices. She needs to know how things get shipped, what metrics are important, etc.</p>

<p>On average, it’s estimated that it takes 3-5 months for a new engineer to be productive. A well-designed tech onboarding will ramp up new hires to a point where they become productive members of the team.</p>

<blockquote>
  <p>The time it takes a new hire to be productive is a measure of how good your onboarding is.</p>
</blockquote>

<p>An effective onboarding for engineers should not ignore tech. So, ensure your tech onboarding helps new hires understand:</p>

<ul>
  <li>
    <p>How software gets shipped</p>
  </li>
  <li>
    <p>What are the technical expectations of the role</p>
  </li>
  <li>
    <p>What are the non-technical expectations</p>
  </li>
  <li>
    <p>What is the release process like</p>
  </li>
  <li>
    <p>How to set up development environments</p>
  </li>
  <li>
    <p>What tooling is used</p>
  </li>
  <li>
    <p>Your code review process</p>
  </li>
  <li>
    <p>Your reliability metrics</p>
  </li>
  <li>
    <p>Incident management</p>
  </li>
  <li>
    <p>Existing and adopted architecture decision records (ADRs)</p>
  </li>
  <li>
    <p>What is the current state of your tech landscape, and what is the expected state?</p>
  </li>
</ul>

<h2 id="set-onboarding-and-role-expectations">Set onboarding and role expectations.</h2>

<p>Without clear expectations, it’s impossible to drive accountability. Setting clear onboarding expectations helps new hires take responsibility for their own onboarding. It gives them a north star of where they’re headed.</p>

<p>When I onboard engineers, I break down onboarding expectations into 5 categories. Each category has a list of expectations that are required during onboarding.</p>

<ul>
  <li>
    <p>What is expected during the first week of onboarding</p>
  </li>
  <li>
    <p>What is expected within 2-3 weeks of joining the team</p>
  </li>
  <li>
    <p>What is expected by the end of month 1</p>
  </li>
  <li>
    <p>What is expected by the end of month 3</p>
  </li>
  <li>
    <p>What is expected by month 6 and beyond</p>
  </li>
</ul>

<h2 id="conduct-regular-check-ins">Conduct regular check-ins</h2>

<p>When I joined <a href="https://architrave.de/">Architrave</a>, I relocated from Nigeria to Germany. Fortunately, I had an amazing people team and a fantastic manager who conducted a series of check-ins with me during my first few months.</p>

<p>The check-ins helped me receive early feedback and helped me navigate the change. The check-ins left a positive impression on me even after leaving the company. My onboarding period at Architrave made me realize the importance of frequent check-ins with new hires.</p>

<p>New hires may find it difficult to ask for support and may struggle silently. This makes it harder for managers to notice and intervene quickly to get them back on track. By having regular check-ins with new hires, you’re creating an opportunity to ask crucial questions about how their onboarding is going and how you could provide the support they need.</p>

<p>I follow structured check-ins with new hires as follows:</p>

<ol>
  <li>
    <p>Check-in after 2 weeks of joining</p>
  </li>
  <li>
    <p>Check-in after 30 days</p>
  </li>
  <li>
    <p>Check-in after 60 days</p>
  </li>
  <li>
    <p>Check-in after 90 days</p>
  </li>
</ol>

<p>Before going into onboarding check-ins with new hires, it’s important to have some feedback ready. Questions like these help you prepare it:</p>

<ul>
  <li>
    <p>Is the engineer doing the work you hired her to do?</p>
  </li>
  <li>
    <p>What is the new engineer doing well, and what can she change to make working with her easier?</p>
  </li>
  <li>
    <p>How confident are you that the engineer is going to be a top performer six months from now?</p>
  </li>
</ul>

<p>The importance of feedback cannot be overestimated and it is well captured in this <a href="https://leadership.garden/onboarding-engineers/">post</a>.</p>

<blockquote>
  <p><em>New hires will need frequent feedback to feel safe in their onboarding and understand if they are doing things right. This is both important for their self-confidence, lowers their stress levels, and makes sure they are indeed in the right direction.</em></p>
</blockquote>

<h2 id="assign-an-onboarding-buddy">Assign an onboarding buddy</h2>

<p>Assigning an onboarding buddy to a new hire is a good way to provide an informal support system for the new hire.</p>

<p>It’s impossible to cover all new hires’ questions in the onboarding handbook or presentations.</p>

<p>When you assign an onboarding buddy, it allows the new hire to quickly build a connection with a fellow engineer and get fast support for things not covered in the employee handbook.</p>

<p>Finally, obtain feedback and continuously use the feedback to improve your onboarding process.</p>]]></content><author><name>James Samuel</name></author><category term="onboarding" /><category term="engineering" /><category term="onboarding" /><category term="hiringmanager" /><summary type="html"><![CDATA[While every organization has its own way of onboarding engineers, there are few elements that will make any engineering onboarding effective. In this post, I share thoughts I have adopted that govern how I onboard engineers into my teams.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/images/logo.png" /><media:content medium="image" url="https://hubofco.de/images/logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Five Valuable Lessons for a New Tech Lead</title><link href="https://hubofco.de/leadership/2022/04/03/five-valuable-lessons-for-a-new-tech-lead/" rel="alternate" type="text/html" title="Five Valuable Lessons for a New Tech Lead" /><published>2022-04-03T09:53:00+00:00</published><updated>2022-04-03T09:53:00+00:00</updated><id>https://hubofco.de/leadership/2022/04/03/five-valuable-lessons-for-a-new-tech-lead</id><content type="html" xml:base="https://hubofco.de/leadership/2022/04/03/five-valuable-lessons-for-a-new-tech-lead/"><![CDATA[<p>As you develop your technical skills, you get to a point where you develop incredible confidence about your ability to deliver irrespective of any technical challenge thrown at you.</p>

<p>But the moment you take a leap into a leadership role, moving beyond leading oneself to lead others, you start to realize that succeeding in this new venture takes more than being technically sound. Suddenly, there are non-technical skills that seem to matter.</p>

<p>Unfortunately, most people don’t get a manual handed over to them to help with the transition. In this post, we will look at five valuable lessons for a new tech lead. Whether you’re an individual contributor who is thinking (or in the process) of stepping into a tech leadership role, or a manager who is overseeing such transitions, this article is for you.</p>

<h3 id="1-your-output-is-measured-differently-than-youre-used-to"><strong>1. Your output is measured differently than you’re used to</strong></h3>

<p>Moving to a tech leadership role implies that you’ve done excellently well as an IC. You’ve earned your way to be a rockstar individual contributor, churning out features in the blink of an eye. But in this new role, what made you a good IC is now slightly different from what will make you a good tech lead. Realizing and adjusting to this new reality may feel strange at first. After a few weeks, or months especially, you may start feeling you’re not as productive as you were. This is because as an IC, you derived a sense of accomplishment from purely technical work such as writing code, refactoring an existing codebase, or optimizing an existing feature.</p>

<p>But as you step into your new role, your responsibilities diversify. You now have a responsibility to carve a technical path for your team. Your responsibilities now include having an overview of the entire system your team is building, establishing technical visions, and driving alignment.</p>

<p>Your role has changed, and how your output is (or will be) measured has changed too. It no longer depends on how much you can single-handedly deliver. Instead, it depends partly on what you do and partly on what you enable others (your team) to do. The quicker you’re able to tune your mind to this new reality, the greater your impact will be.</p>

<h3 id="2-aligning-priorities-is-now-part-of-your-job"><strong>2. Aligning priorities is now part of your job</strong></h3>

<p>Aligning priorities means helping your team to focus on what’s most important and impactful. It means making sure your team’s efforts are all in sync with one another, supporting and reinforcing each other on the right thing.</p>

<p>As an IC, you were part of a team; you adopted the team’s goals and priorities. But as you become a tech lead, you start to take an active role in helping to shape your team’s priorities. You start to develop a wider understanding of your team’s goal and use that to decide what’s most urgent and important in terms of collaborating with other stakeholders.</p>

<p>There are always hundreds of tasks to be done, hundreds of features to build, experiments to run, and ideas to test. Being effective in this new journey requires a focus on value and impact and knowing how to choose which results to deliver.</p>

<p>You have to train yourself to look at every project or idea through this lens: how can we use 20% of our efforts to unlock 80% of our business goals, and how can I convince others and get them on board?</p>

<p>There won’t be enough time to build every idea. And there won’t be enough resources to create the ideal, ultra-polished product. The three variables you must now constantly play with are time, resources, and scope:</p>

<p><img src="https://assets.gathercontent.com/NDk3NTA/qB0iDzEKOJigpYxZ?auto=format%2Ccompress&amp;fit=max&amp;h=800&amp;q=75&amp;s=ee3eb908bf9abc174952b3a03f30e73b" alt="" /></p>

<p>Figure 1: The triple constraints</p>

<p><strong>Scope</strong> measures the work to be done. <strong>Resources</strong> include budget, team members, and everything at your disposal to execute the work to be done. <strong>Time</strong> refers to deadlines and how long you have to deliver on the work. You need to learn how to negotiate project scopes and requirements based on available time and resources.</p>

<h3 id="3-how-you-communicate-matters-more"><strong>3. How you communicate matters more</strong></h3>

<p>One of the most important skills to master is communicating effectively. Not all tech leads are good communicators, but all good tech leads communicate well.</p>

<p>Building software is a complicated process with numerous options and trade-offs that can easily be overlooked when communication is poor. As you step into a tech lead role, the way you communicate begins to have greater consequences and impact. Rather than simply making information available for people, you need to think carefully about what and how you communicate to people in order to achieve the desired outcome.</p>

<p>Sometimes, the curse of knowledge (where you develop a cognitive bias that makes it difficult to remember what it was like to be a novice) might kick in as you have access to broader information. You might think your team knows everything that you do or has the same context that you have on a project, when they don’t.</p>

<p>The effect of miscommunication could lead your team to veer away from building the right thing or making technical decisions that don’t align with the project’s direction, which can be expensive to fix later on.</p>

<p>You have to communicate and overcommunicate. Be creative about it at times; this could be as simple as creating a diagram or mock-up screen to pass the information across. When expressing complex ideas, it’s important to take a couple of minutes to think about how you can make it easy for your audience to understand. The worst mistake you can make is to assume people already know. Sometimes people forget things or are lost in other thoughts. So don’t be afraid to repeat yourself to get information to stick.</p>

<h3 id="4-you-should-avoid-premature-fixes"><strong>4. You should avoid premature fixes</strong></h3>

<p>As you step into your new role, you have to learn to wait for things to unfold in order to have a clear picture of what is going on before implementing a fix.</p>

<p>This can be very hard to grasp because it goes against our very instincts as engineers. As makers, we know what happens when we don’t build to counter the various edge cases that could make our application behave in unexpected ways. We know users of software can’t be trusted, so we have to anticipate problems in advance and put constraints in place to guard against them.</p>

<p>As you step from coding software for people to use, to leading people that code software for people to use, rushing to implement constraints to prevent or guard against problems you anticipate can sometimes have unintended consequences.</p>

<p>You can think of constraints as processes, practices, or guidelines enforced and intended to prevent certain problems or lead to certain outcomes.</p>

<p>While a lack of processes can affect your team’s performance and ultimately lead to the development of low-quality products, on the other hand, too many processes created in anticipation of problems that are not yet clear can slow your team down and hamper its productivity.</p>

<blockquote>
  <p>‘The Master allows things to happen. She shapes events as they come. She steps out of the way and lets the Tao speak for itself.’ – <em>The Daodejing</em>, Lao Tzu</p>
</blockquote>

<p>Instead of rushing in to create tons of processes right away, wait for a problem to surface that a process would actually fix. You’ll not only be able to craft a more appropriate solution, but you will also get your team’s buy-in more easily, and together you will be able to define a better approach to fixing the problem.</p>

<p>Not only will your team see the need for it, but they will also understand why it is important. Since they’re involved in providing a solution, they will feel a sense of accountability.</p>

<h3 id="5-you-need-to-learn-to-manage-projects"><strong>5. You need to learn to manage projects</strong></h3>

<p>One important skill that feels very different from your previous role is dipping your toes into a bit of project management.</p>

<p><a href="https://www.pmi.org/about/learn-about-pmi/what-is-project-management">The Project Management Institute</a> (PMI) defines project management as the ‘use of specific knowledge, skills, tools, and techniques to deliver something of value to people.’ Because you and your team are responsible for delivering something of value (software) to users and stakeholders, you have an obligation towards your users as well as your stakeholders, which means you need to make sure work gets done.</p>

<p>There will be non-engineering bottlenecks that raise their heads to prevent you from delivering on time or block your team’s progress. These could include dependencies on people, corporate processes, or mismanaged expectations. Now it’s on you to identify and tackle these problems.</p>

<p>Now you need to think about planning and breaking tasks down. You need to inform stakeholders outside your team of what is going on in engineering and, likewise, inform your team of what is going on in other teams.</p>

<p>Management is now a part of your job. Rather than simply handing off anything that has to do with project management to project managers, you now need to work with them and complement their efforts.</p>

<h3 id="reflections"><strong>Reflections</strong></h3>

<p>The transition from IC to tech lead is never easy, and every journey is unique. There’s no manual and you’ll need to figure things out as you go. Take time to read, experiment, learn, and reflect. You will make mistakes, but what matters most is learning from them. That’s how you’ll grow as a leader.</p>

<hr />

<blockquote>
  <p>Originally published on <a href="https://leaddev.com/professional-development/five-valuable-lessons-new-tech-lead">www.leaddev.com</a>.</p>
</blockquote>]]></content><author><name>James Samuel</name></author><category term="leadership" /><category term="techlead" /><category term="engineeringmanager" /><category term="softwareleads" /><summary type="html"><![CDATA[As you develop your technical skills, you get to a point where you develop incredible confidence about your ability to deliver irrespective of any technical challenge thrown at you.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/images/logo.png" /><media:content medium="image" url="https://hubofco.de/images/logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">11 Blog Posts That Shaped My Leadership Perspectives in 2021</title><link href="https://hubofco.de/leadership/2022/01/06/11-blogposts-that-changed-how-i-lead-software-teams-in-2021/" rel="alternate" type="text/html" title="11 Blog Posts That Shaped My Leadership Perspectives in 2021" /><published>2022-01-06T14:41:00+00:00</published><updated>2022-01-06T14:41:00+00:00</updated><id>https://hubofco.de/leadership/2022/01/06/11-blogposts-that-changed-how-i-lead-software-teams-in-2021</id><content type="html" xml:base="https://hubofco.de/leadership/2022/01/06/11-blogposts-that-changed-how-i-lead-software-teams-in-2021/"><![CDATA[<p>Last year, I broke my records of daily readings. It was the year I read the most books, and it was also the year I read the most blog posts. In 2021, I was fortunate enough to come across great posts and books that shaped my perspectives on leading software teams and helped me become a better software engineer.</p>

<p>In this post, I’ll share some of these blog posts. I believe there are ideas and new learnings in them that will make you a better engineer or engineering manager.</p>

<h1 id="making-the-most-of-these-posts">Making the Most of These Posts</h1>

<p>Frequently, I find myself in a situation where I could use some insights from what I’ve read, only to realise I’ve completely forgotten them. So I’ve developed a process that helps me remember what I read and make it stick.</p>

<p>As you read through each post, I recommend doing the following to get the best out of each post.</p>

<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i2lj91yutkli0zgb2eh6.jpeg" alt="Image description" /></p>

<ol>
  <li>
    <p>Know what you’re reading. Read what the introduction is about and note down what you’ll learn at the end. This will prepare your mind for the insights to come.</p>
  </li>
  <li>
    <p>Identify and jot down key ideas and information in the post. Write down learnings, insights, and ideas.  Taking notes of what you learn prompts you to think about your learnings and also improves your memory.</p>
  </li>
  <li>
    <p>Make your notes searchable.</p>
  </li>
  <li>
    <p>Re-read your notes often.</p>
  </li>
</ol>

<h2 id="effective-communication-is-not-about-what-you-say--melanie-ensign"><a href="https://github.com/readme/guides/effective-communication">Effective Communication Is Not About What You Say – Melanie Ensign</a></h2>

<p>Communication is crucial to leading effectively. But there is a clear difference between merely passing a message across and communicating effectively. The outcome of any communication is what determines how effective it is. Simply pushing out a message or publishing information is not enough to measure its impact. Effective communication focuses less on what you say and more on what people need to hear in order to reach the outcome you desire.</p>

<h2 id="you-are-your-own-best-hype-person--marie-chatfield"><strong><a href="https://developer.squareup.com/blog/you-are-your-own-best-hype-person">You Are Your Own Best Hype Person – Marie Chatfield</a></strong></h2>

<p>No one lights a candle and puts it under a bushel. At some point in any engineer’s career, you’re going to have to prove to someone that you’re worth it: worth hiring, worth promoting, worth the raise, worth taking a risk on. The better you are at making what you do known to others (that’s hyping yourself), the higher your chances of succeeding.</p>

<h2 id="dont-lead-by-example--james-coling"><strong><a href="https://dropbox.tech/infrastructure/dont-lead-by-example">Don’t Lead By Example – James Coling</a></strong></h2>

<p>We’re often told to lead by example. Yes, we all need to set good examples for people to follow. Oftentimes, however, we overstep our boundaries by continuously taking the lead without setting expectations, leaving reports feeling crowded out or disempowered. This post helped me to understand that setting a good example is necessary, but isn’t sufficient for strong technical leadership. Acting like a tech lead means setting clear expectations and embracing direct communication.</p>

<h2 id="leading-without-managing--david-golden"><strong><a href="https://xdg.me/leading-without-managing/">Leading Without Managing – David Golden</a></strong></h2>

<p>Julie Zhuo, in her book, defines leadership as being able to guide and influence other people. When most people think of leading, they think of it as something that people in authority or engineers who wear a manager’s or tech lead’s hat do. On the contrary, anyone can be a leader, and it’s a skill anyone can learn. One of the key points I learned from this post was that anyone can lead without authority or coercive power, and how to do just that.</p>

<h2 id="three-crucial-skills-leaders-must-develop-to-become-executives--nikhyl-singhal"><strong><a href="https://theskip.substack.com/p/three-crucial-skills-that-leaders">Three Crucial Skills Leaders Must Develop to Become Executives – Nikhyl Singhal</a></strong></h2>

<p>“Executive skills are subtle and can be elusive to managers, demanding a great deal of focus, courage, and dedication. Becoming a great executive requires a set of skills beyond what’s needed to be a leader”. Nikhyl Singhal, in this post, shares crucial skills that leaders must develop to become executives. Some of the key takeaways are:</p>

<ul>
  <li>
    <p>Learn to take risks and constantly put yourself in new, challenging, and increasingly ambiguous settings.</p>
  </li>
  <li>
    <p>Learn to build strong relationships with other senior people who work in entirely different functions.</p>
  </li>
  <li>
    <p>Learn to build teams and trust the people in them.</p>
  </li>
</ul>

<h2 id="10-admirable-attributes-of-a-great-technical-lead--elye-medium-paywall"><strong><a href="https://betterprogramming.pub/10-admirable-attributes-of-a-great-technical-lead-251d13a8843b">10 Admirable Attributes of a Great Technical Lead – Elye</a> (Medium Paywall)</strong></h2>

<p>“It takes a lot of effort to be a great tech lead. It’s a delicate balancing act between two poles of the same attribute. If there is too much weight on one side, the person may fall”. This post outlines ten attributes that will make any tech lead a better one.</p>

<h2 id="being-a-chief-technology-officer-lessons-learned-in-my-first-year--shekhar-gulati"><strong><a href="https://shekhargulati.com/2021/01/03/being-chief-technology-officer-lessons-learned-in-my-first-year/">Being a Chief Technology Officer: Lessons Learned in My First Year – Shekhar Gulati</a></strong></h2>

<p>Most people are not lucky enough to have a manual handed to them when stepping into a leadership role. As one climbs the ladder, one begins to notice that what is required to succeed in the role is slightly different from what makes one an excellent individual contributor. The lessons I found valuable in this post include picking your battles wisely and getting things done without actually doing them.</p>

<h2 id="building-mental-models-of-ideas-that-dont-change--hammad-khalid"><strong><a href="https://shopify.engineering/building-mental-models">Building Mental Models of Ideas That Don’t Change – Hammad Khalid</a></strong></h2>

<p>For individual contributors and engineering managers, there is always new stuff to learn. If you’re at the beginning of your career, it can be overwhelming. One way to keep your head above water is to prioritise what to learn and to keep track of principles or ideas that don’t change (that is, mental models) so you can make better-informed decisions. This post outlines a list of engineering principles and management principles, some of which took me years to learn.</p>

<h2 id="a-collection-of-how-tos-templates-and-articles-to-help-you-be-the-manager-and-leader-your-team-needs"><strong><a href="https://softstuff.tools/">A Collection of How-Tos, Templates, and Articles to Help You Be the Manager and Leader Your Team Needs</a></strong></h2>

<p>This link contains a collection of todos, how-tos, templates, and articles for engineering managers and tech leads. The categories covered include techniques and templates for:</p>

<ul>
  <li>
    <p>Conducting interviews</p>
  </li>
  <li>
    <p>Conducting meetings</p>
  </li>
  <li>
    <p>Building diversity and inclusion</p>
  </li>
  <li>
    <p>Recruiting</p>
  </li>
  <li>
    <p>Building Culture</p>
  </li>
  <li>
    <p>Growth and more.</p>
  </li>
</ul>

<h2 id="11-top-responsibilities-and-10-common-mistakes-of-a-technical-leader-lorenzo-pasqualis"><strong><a href="https://dev.to/lpasqualis/11-top-responsibilities-and-10-common-mistakes-of-a-technical-leader-9po">11 Top Responsibilities and 10 Common Mistakes of a Technical Leader – Lorenzo Pasqualis</a></strong></h2>

<p>“Similar to building a Lego tower, building software requires making decisions from the beginning to the end. The first responsibility of a technical leader is to define the engineering reality: What needs to be built, the general technical direction, an example of the golden standard for productivity and technical excellence, the business and technological context, and the time and resource constraints.”   If you’re a Tech Lead or an Engineering Manager, you’ll find this post interesting.</p>

<h2 id="first-day-success-manual-for-new-managers--john-reh"><strong><a href="https://www.thebalancecareers.com/succeeding-on-your-first-day-as-a-manager-2276172">First Day Success Manual for New Managers – John Reh</a></strong></h2>

<p>One of the surest ways to set yourself up for failure when joining a new team is to start by criticizing the existing codebase and past practices, and by implementing tons of new processes from day one without having a clear picture of how things came to be. Criticizing past practices, no matter how ineffective they seem, is one of the many ways new hires set themselves up for failure. I love this post because it outlines how to prepare and make a really good impression when joining a new team, from meeting the team to attitude and culture.</p>

<hr />

<h1 id="where-to-go-from-here-join-software-leads-newsletter">Where To Go From Here: Join Software Leads’ Newsletter</h1>

<p>To take your learnings further, join the <a href="https://softwareleads.substack.com/">Software Leads’ newsletter</a>, a monthly newsletter on software engineering and leadership.</p>

<p>Every month, I publish a new issue that includes links to interesting articles, use cases, and insights from experts on leading teams and solving common software engineering challenges at scale. Together, we learn and become better software engineers, better software leads, better engineering managers, and CTOs by making fewer mistakes as we learn from one another.</p>]]></content><author><name>James</name></author><category term="leadership" /><category term="leadership" /><category term="management" /><category term="career" /><summary type="html"><![CDATA[Last year, I broke my records of daily readings. It was the year I read the most books, and it was also the year I read the most blog posts. In 2021, I was fortunate enough to come across great posts and books that shaped my perspectives on leading software teams and helped me become a better software engineer.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/images/logo.png" /><media:content medium="image" url="https://hubofco.de/images/logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">A Guide to Creating API Products</title><link href="https://hubofco.de/softwareengineering/product/2021/03/13/a-guide-to-creating-api-products/" rel="alternate" type="text/html" title="A Guide to Creating API Products" /><published>2021-03-13T17:53:00+00:00</published><updated>2021-03-13T17:53:00+00:00</updated><id>https://hubofco.de/softwareengineering/product/2021/03/13/a-guide-to-creating-api-products</id><content type="html" xml:base="https://hubofco.de/softwareengineering/product/2021/03/13/a-guide-to-creating-api-products/"><![CDATA[<p><img src="https://res.cloudinary.com/samueljames/image/upload/c_scale,w_200/v1615721592/ApI.png" alt="API Products" /></p>

<p>Building a good Application Programming Interface (API) is more than returning responses. As a developer who has integrated with tons of APIs, I have noticed a pattern that separates successful API products from those that are not. It’s about solving problems with great affordance. One might ask what the key to building good API products is. In this post, I’ll provide some tips that I have found essential when doing so.</p>

<p>After landing my first software engineer role seven years ago, my first big project was building an online book marketplace. In the project, I integrated with multiple APIs. From pre-filling book information using an International Standard Book Number (ISBN) to charging customer credit cards, integration with APIs was key to this project.</p>

<h2 id="working-with-apis">Working with APIs</h2>

<p>Reflecting on my career, I have integrated, built, and exposed multiple APIs. With time, I began to notice two distinct patterns about APIs. Some have certain characteristics that make developers fall in love with them. There is something about them that draws you in, and you feel like drawing others in too. You just want to tell others about them and recommend them, at no cost.</p>

<p>Likewise, the universe is not free of APIs that make you feel otherwise. They appear not to live up to their promises. Either the documentation provides only a generic list of actions you can perform with the API, or it’s hard to understand how the API can solve your problem. Sometimes, if you manage to get over the documentation hurdle without issues, the implementation is lurking around the corner to bite you. If you overcome the implementation difficulties, security or support is just around the corner waiting to pounce on you.</p>

<h2 id="guidelines-and-categories">Guidelines and Categories</h2>

<p>I realized there are two categories of APIs: well-designed APIs and poorly designed ones. Until a particular time in my career, I assumed poorly designed APIs were a function of the developers who built them. While this may be true in some cases, I have also learned that the fate of an API is not decided during the development phase alone. It starts the moment you conceive an idea that needs to expose some interfaces, and it continues well beyond that. I like to see APIs as more than mere interfaces or endpoints, at least in this post. I prefer to think of them as products that expose some interfaces and solve a problem irrespective of the technology or protocols. Therefore, I will use “APIs” and “API products” interchangeably.</p>

<p>While there are so many API guidelines and best practices for implementing them, most are geared towards code implementation details. Little has been said on the business side of things. It’s important to note that well-designed API products ride on a rock-solid business strategy.</p>

<p>API usage is growing at an explosive rate. According to a recent <a href="https://blog.postman.com/api-growth-rate/">Postman API report</a>, between January 2019 and January 2020, there was an increase of more than 100% in API growth. This growth has been mainly due to significant shifts in the tech industry and consumer behavior: the rise in cloud adoption, the increase in microservice architecture adoption, and consumers’ love for multi-device experience.</p>

<h3 id="to-put-these-in-a-better-way">To put these in a better way:</h3>

<ul>
  <li>
    <p>Consumers are moving from a single to multi-device experience, which drives a need for systems to be integrated.</p>
  </li>
  <li>
    <p>Businesses are moving away from the monolithic way of building applications to a more decentralized architecture.</p>
  </li>
  <li>
    <p>Organizations are shifting from on-premises data centers to the cloud.</p>
  </li>
</ul>

<p>The tech industry’s significant shifts and <a href="https://www.mindtheproduct.com/understanding-users-learnings-from-the-mtpcon-session-speakers/">consumer behavior</a> further confirm that the API economy is here to stay, and the businesses benefiting the most tend to create API products that developers love to use.</p>

<p>Every successful product either solves a problem or makes solving a problem easy. Such is true with API products. Good API products are sufficiently capable of solving the problem they’re designed to solve without compromising availability and security. They are easy to use, easy to evolve, and hard to misuse.</p>

<h2 id="understand-the-why-and-how">Understand the “Why” and “How”</h2>

<p>It’s not uncommon to see APIs whose purpose no one understands. API products are no different from other great products. Great products have a rock-solid strategy: the “why” and “how” are clearly defined and understood. Before you set out to build an API, you first need to ask yourself: why do I want to build an API? What problem will the API solve or make it easy to solve? How will the API help me achieve my goals? If you’re unable to understand the business needs for an API, you should engage the right people and figure them out before designing it. Your strategy will carry through from the business down to the implementation details. When your strategy is clear, it will challenge developers to think creatively in achieving the end goals.</p>

<h2 id="design-matters">Design Matters</h2>

<p>Mehdi Medjaoui, in the book <a href="https://www.amazon.com/Continuous-API-Management-Decisions-Landscape/dp/1492043559">Continuous API Management</a>, which he co-authored, describes design as something you do when you make decisions about how something you’re creating will look, feel, and be used. Each time you add or update an API, you’re actively making design decisions for better or worse. This is why you have to make a deliberate effort at the design stage; you can’t rush it.</p>

<p>When designing your API, you should pay attention to the following:</p>

<ul>
  <li>
    <p>Vocabularies: Are your words and terms easy to understand for your users?</p>
  </li>
  <li>
    <p>Styles: Which API styles or protocols are you supporting, REST or GraphQL?</p>
  </li>
  <li>
    <p>Naturalness: Do your users have to change their usual ways of solving their problems significantly? Did you follow established standards and conventions?</p>
  </li>
  <li>
    <p>Consistency: What level of familiarity will you provide? Are your APIs similar to what your users may have used in the past?</p>
  </li>
</ul>
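
<p>Consistency in particular lends itself to an automated check. As a minimal sketch, assuming your team has settled on snake_case field names (an assumption for illustration; the convention itself matters less than sticking to one), a helper like the hypothetical <code>inconsistent_fields</code> below could flag response fields that break the convention:</p>

```python
import re

# Assumed convention for this sketch: every JSON field name is snake_case.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def inconsistent_fields(payload, path=""):
    """Return dotted paths of keys in a JSON-like payload that are not snake_case."""
    problems = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else key
            if not SNAKE_CASE.match(key):
                problems.append(here)
            # Keep walking so nested objects are checked too.
            problems.extend(inconsistent_fields(value, here))
    elif isinstance(payload, list):
        for item in payload:
            problems.extend(inconsistent_fields(item, f"{path}[]"))
    return problems


# A made-up response from a hypothetical book API.
example_response = {
    "book_id": 1,
    "ISBN": "9780000000000",
    "authors": [{"firstName": "Ada", "last_name": "Lovelace"}],
}
print(inconsistent_fields(example_response))
```

<p>Running a check like this against example responses during review is one cheap way to catch drift before your users do.</p>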

<h2 id="get-good-at-documentation">Get Good at Documentation</h2>

<p>Documentation is an important aspect of building API products that people love to use. The 2020 <a href="https://www.postman.com/state-of-api/executing-on-apis/#executing-on-apis">State of the API report</a> showed that lack of documentation is the biggest obstacle to consuming APIs. If you’re a developer, you understand how frustrating a lack of sufficient documentation can be when integrating with APIs. Be aware that people will read your API documentation for three primary reasons: evaluation, integration, and debugging.</p>

<p>Evaluation is what people do when they are trying to see if your API product solves their problem. Once your API is evaluated, users will need your documentation for integration. Once your API is embedded in their application, they will visit it when something goes wrong.</p>

<p>If you’re providing only one type of documentation, you’re underserving your users. Teams must understand that documentation isn’t just a “dump” of API parameters. There are marketing and customer support aspects that should be included. Besides your endpoints, errors, requests, and response structures, your documentation should also tell the story: how you see the problems, and how your solutions solve them.</p>
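
<p>One way to keep all three reading purposes in view is to check each endpoint’s documentation against them. The sketch below is only an illustration: the section names (<code>summary</code>, <code>use_cases</code>, <code>request_example</code>, <code>response_example</code>, <code>errors</code>) and the mapping to purposes are my own assumptions, not a standard.</p>

```python
# Hypothetical mapping from each reading purpose to the doc sections that serve it.
REQUIRED_SECTIONS = {
    "evaluation": ("summary", "use_cases"),
    "integration": ("request_example", "response_example"),
    "debugging": ("errors",),
}


def missing_sections(endpoint_doc):
    """Return, per reading purpose, the sections that are absent or empty."""
    gaps = {}
    for purpose, fields in REQUIRED_SECTIONS.items():
        absent = [name for name in fields if not endpoint_doc.get(name)]
        if absent:
            gaps[purpose] = absent
    return gaps


# A made-up, partially documented endpoint.
lookup_doc = {
    "summary": "Look up a book by its ISBN.",
    "request_example": "GET /books/9780000000000",
    "response_example": '{"book_id": 1}',
}
print(missing_sections(lookup_doc))
```

<p>Here the endpoint can be integrated, but a reader evaluating the API or debugging a failure would come away empty-handed.</p>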

<h2 id="get-development-right">Get Development Right</h2>

<p>This is one of the most important phases in API development. This is where your ideas and strategies are turned into implementations. The development phase contains many vital decisions that are hard to reverse if not made right. You will need to make some internal decisions, such as which technologies, protocols, and runtimes to use, and where to run the <a href="https://www.mindtheproduct.com/a-product-managers-approach-to-building-integrations-for-saas-software/">software</a> that powers your API. End users don’t care whether you use NoSQL or MySQL as your datastore. What’s important when you make technology decisions is that you choose a technology your team can support.</p>

<p>The quality of the tech decisions you make matters well beyond what is needed to launch your product’s first version. Launching the first version of your API is only one part of the job. You have to make decisions about your API’s quality, reliability, changeability, and maintainability over its lifetime. <a href="https://www.postman.com/state-of-api/executing-on-apis/#executing-on-apis">API performance and uptime</a> are considered to be some of the major determinants of whether an API product meets consumers’ expectations. This sheds light on how consumers will assess your API.</p>

<h2 id="tracking-effectiveness">Tracking Effectiveness</h2>

<p>You also have to decide what to test and how you will test it to build confidence in your API. In summary, your tests should be able to cover the following:</p>

<ul>
  <li>
    <p>Can the API Product deliver on the strategic goals?</p>
  </li>
  <li>
    <p>Is the API Product’s quality high enough to support the strategic goals?</p>
  </li>
</ul>

<p>APIs are meant to be simple: they hide complexity rather than expose it. I like to compare an API to an electric plug, which is simple to use without any complicated manual. A plug does not care about the source of electricity, the changeover switch installed, or the electricity provider. It’s a conduit for electricity: you plug in, and it just works. Admittedly, modern systems are complex, but the complexity should not leak into the interfaces.</p>
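
<p>To make the plug analogy concrete, here is a minimal sketch of an interface that keeps its internals hidden. Everything in it is hypothetical: the <code>charge</code> function, the <code>Receipt</code> type, and the two gateway classes are invented for illustration and do not belong to any real payment library. The caller sees one function; the failover between providers (the “changeover switch”) stays behind it.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    """The only thing a caller sees after a successful charge."""
    charge_id: str
    amount_cents: int
    currency: str


class _PrimaryGateway:
    """Hypothetical internal provider; callers never touch it directly."""

    def submit(self, amount_cents, currency):
        # Simulate an outage for large amounts so the failover path is visible.
        if amount_cents > 100_000:
            raise ConnectionError("primary gateway unavailable")
        return f"primary-{amount_cents}-{currency}"


class _BackupGateway:
    """Hypothetical fallback provider, equally invisible to callers."""

    def submit(self, amount_cents, currency):
        return f"backup-{amount_cents}-{currency}"


def charge(amount_cents, currency="USD"):
    """Public interface: one call, no knowledge of gateways required."""
    if amount_cents <= 0:
        raise ValueError("amount_cents must be a positive integer")
    for gateway in (_PrimaryGateway(), _BackupGateway()):
        try:
            charge_id = gateway.submit(amount_cents, currency)
        except ConnectionError:
            continue  # Try the next provider; the caller never notices.
        return Receipt(charge_id, amount_cents, currency)
    raise RuntimeError("no payment gateway is available")


print(charge(500))
print(charge(200_000, "EUR"))
```

<p>Both calls look identical from the outside, even though the second one is quietly served by the backup provider.</p>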

<h2 id="in-conclusion">In Conclusion</h2>

<p>Building a good API is more than accepting requests and returning responses. That’s far from the goal. It’s about solving problems with great affordance. You have to align usability with purpose continuously. It’s important to note that people will use your API differently. There will be different problems people use your API to solve. You can’t change that. What you cannot do is be everything to everyone.</p>

<p>Finally, you have to go out there and let developers understand your products’ benefits by connecting with them at various events. You have to let them see how your products make their lives easier. The tech community is competitive. It’s no longer a matter of <em>if you build it, they will come</em>, but instead of <em>build, take it to them and beyond</em>.</p>

<blockquote>
  <p>Originally published by me on <a href="https://www.mindtheproduct.com/a-guide-to-creating-api-products/">www.mindtheproduct.com</a>, [03/02/2021].</p>
</blockquote>]]></content><author><name>samuel</name></author><category term="SoftwareEngineering" /><category term="Product" /><category term="ApI" /><category term="API products" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://res.cloudinary.com/samueljames/image/upload/c_scale,w_200/v1615721592/ApI.png" /><media:content medium="image" url="https://res.cloudinary.com/samueljames/image/upload/c_scale,w_200/v1615721592/ApI.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Choosing the Right Cloud Provider – Containers</title><link href="https://hubofco.de/softwareengineering/2021/02/15/choosing-the-right-cloud-provider-containers/" rel="alternate" type="text/html" title="Choosing the Right Cloud Provider – Containers" /><published>2021-02-15T07:15:00+00:00</published><updated>2021-02-15T07:15:00+00:00</updated><id>https://hubofco.de/softwareengineering/2021/02/15/choosing-the-right-cloud-provider-containers</id><content type="html" xml:base="https://hubofco.de/softwareengineering/2021/02/15/choosing-the-right-cloud-provider-containers/"><![CDATA[<p>Prior to container technology, making applications run on different environments was one of the greatest struggles for a developer. “It runs on my machine” was a common frustration you heard from engineers—including me, many times.</p>

<p>Like numerous engineers, I’ve built working code on my machine that then refused to run in other environments because of differences between them, that is, different configurations and dependencies.</p>

<blockquote>
  <p>Editorial note:  This post was initially published on Iamondemand’s blog. You can check out the <a href="https://iamondemand.com/blog/which-cloud-provider-is-right-for-you-an-iod-series-part-2/">original here</a>, at their site.</p>
</blockquote>

<p><a href="https://www.docker.com/resources/what-container">Docker</a> defines containers as “a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.”</p>

<p>By bundling an entire runtime environment, including an application plus all its configurations, binary files, dependencies, and libraries needed to run it into a single image, container technology eliminates the “but it runs on my local machine” problem and allows you to adopt the “build once, deploy anywhere” principle.</p>

<p>Containers provide a consistent deployment environment that development teams can use across software development and delivery pipelines.</p>

<p>The demand today for fault-tolerant, reliable, and scalable software has given rise to thousands of containers running in a typical production environment, which brings a new problem: management. Container orchestration tools like <a href="https://iamondemand.com/blog/5-facts-you-need-know-about-k8s-til-now/">Kubernetes</a> solve this issue.</p>

<p>Kubernetes lets you run and manage containers and deploy stateful and stateless applications with multiple cloud providers. With Kubernetes, you can run a single application across a fleet of machines or different computing environments. Furthermore, you can either deploy your applications to a Kubernetes cluster in the cloud or on-premises. Of course, the major challenge with cloud deployments is selecting the right provider out of the many out there that offer managed Kubernetes as a service.</p>

<p>Before making a decision, it’s important you examine each service provider to identify which one best suits your organization’s needs. In this post, I’ll discuss the top Kubernetes as a Service offerings, their similarities, key differences, and how to choose the right one.</p>

<h1 id="how-to-choose-the-right-kubernetes-provider">How to Choose the Right Kubernetes Provider</h1>

<p>Most cloud providers have their own set of Kubernetes hosting environments for managing containers. Although Kubernetes offerings vary wildly among providers, here are some factors to consider as you shop for the right platform to manage your container workloads:</p>

<h2 id="data-security">Data Security</h2>

<p>Before selecting a Kubernetes service provider, you should understand each provider’s security governance process, measures, and mechanisms for preserving your data and application. An ideal Kubernetes service provider should align with strict security compliance and industry best practices. It should also implement some sort of confidentiality for sensitive data and provide administrative rights for restricting, monitoring, blocking, and approving users’ access to data.</p>

<h2 id="performance-and-reliability">Performance and Reliability</h2>

<p>Reliability is an important requirement of any application running in the cloud. You should ascertain the reliability of cloud service providers by comparing their performance against their service-level agreements for the last 6 to 12 months. Cloud service providers usually publish this information, and even if they don’t, you can always request it.</p>

<p>You shouldn’t expect 100% perfection because every provider will experience downtime at some point. A good example was the <a href="https://techcrunch.com/2020/12/14/gmail-youtube-google-docs-and-other-services-go-down-simultaneously-in-multiple-countries/">recent Google incident</a>. What matters most is how often downtime happens and how a provider handles it when it does occur. An ideal Kubernetes service provider should have documented, proven, and established processes for handling unplanned and planned downtime.</p>

<h2 id="standards-and-certifications">Standards and Certifications</h2>

<p>Before going forward with a Kubernetes provider, you need to ensure that the provider adheres to the best industry practices and recognized standards. It’d be helpful if you also understood how the provider plans to continuously adhere to these standards. An ideal Kubernetes provider should have good knowledge management, effective data management, service status visibility, and structured processes.</p>

<p>Now that you’re familiar with the criteria to consider while searching for the right Kubernetes Managed Service Provider, the next step is to identify the market’s biggest players.</p>

<h1 id="top-options-for-kubernetes-as-a-service">Top Options for Kubernetes as a Service</h1>

<p>There are many cloud services for running container workloads using Kubernetes, but let’s closely examine the Kubernetes solution from the Big 3 players: Amazon’s Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Microsoft’s Azure Kubernetes Service (AKS).</p>

<h2 id="amazon-elastic-kubernetes-service">Amazon Elastic Kubernetes Service</h2>

<p><a href="https://aws.amazon.com/eks/?whats-new-cards.sort-by=item.additionalFields.postDateTime&amp;whats-new-cards.sort-order=desc&amp;eks-blogs.sort-by=item.additionalFields.createdDate&amp;eks-blogs.sort-order=desc">Amazon Elastic Kubernetes Service (EKS)</a> is AWS’s managed container offering that allows developers to start, run, and scale Kubernetes apps on-premises or in the AWS cloud. EKS also allows you to plan, schedule, and execute your container workloads across compute services and features available on AWS, such as Spot Instances, AWS Fargate, and EC2.</p>

<p>With Amazon EKS, you can easily manage your applications and Kubernetes clusters across hybrid environments too, plus you can use the Kubernetes jobs to run parallel or sequential batch workloads on your EKS cluster. Common use cases for Amazon EKS include hybrid development, batch processing, machine learning, and web applications. The fee charged by AWS is $0.10/hour for each EKS cluster created.</p>
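<p>To make the per-cluster fee concrete, here is a minimal Python sketch of the arithmetic. The only figure taken from this post is the $0.10/hour rate; the 730-hour month and the function name are assumptions made for illustration, and the result excludes the compute, storage, and networking you pay for separately.</p>

```python
HOURLY_CLUSTER_FEE_USD = 0.10  # per-cluster management fee quoted in this post


def monthly_management_fee(clusters, hours_per_month=730):
    """Return the management fee in USD for `clusters` clusters over one month.

    730 hours is an assumed average month (8,760 hours / 12); it is not an AWS figure.
    """
    return round(clusters * hours_per_month * HOURLY_CLUSTER_FEE_USD, 2)


# Three clusters (for example dev, staging, and production) for one month:
print(monthly_management_fee(3))
```

<p>Under these assumptions, three always-on clusters cost about $219 per month in management fees alone, which is worth factoring in before you create one cluster per environment.</p>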

<h2 id="strengths">Strengths</h2>

<ul>
  <li>
    <p>Highly configurable and customizable</p>
  </li>
  <li>
    <p>Efficiently provisions and scales your Kubernetes application</p>
  </li>
</ul>

<h2 id="limitations">Limitations</h2>

<ul>
  <li>
    <p>Complexity involved with addition and customization of node pools</p>
  </li>
  <li>
    <p>No automatic node updates</p>
  </li>
</ul>

<h2 id="google-kubernetes-engine-gke">Google Kubernetes Engine (GKE)</h2>

<p><a href="https://cloud.google.com/kubernetes-engine">Google Kubernetes Engine (GKE)</a> is Google’s managed offering that allows you to deploy, manage, and scale containerized applications. The GKE environment comprises multiple Compute Engine instances grouped together to form clusters that are powered by Kubernetes. By running a GKE cluster, you get to benefit from Google’s advanced cluster management features, including automatic upgrades and scaling, logging and monitoring, node auto-repair, load-balancing, and more.</p>

<p>With Google Kubernetes Engine, you can create clusters based on your budget and the availability requirements of your workload. The two available cluster types are regional and zonal (multi-zonal and single-zonal). GKE has a financially backed Service Level Agreement that guarantees 99.5% availability for <a href="https://cloud.google.com/kubernetes-engine/docs/concepts/types-of-clusters">zonal clusters</a> and 99.95% availability for <a href="https://cloud.google.com/kubernetes-engine/docs/concepts/types-of-clusters">regional clusters</a>. Just like Amazon, Google charges $0.10/hour for every GKE cluster that you create, although there is no charge for Anthos GKE Clusters.</p>

<h2 id="strengths-1">Strengths</h2>

<ul>
  <li>
    <p>Easy cluster creation and management</p>
  </li>
  <li>
    <p>Automatic cluster patching and upgrades</p>
  </li>
</ul>

<h2 id="limitations-1">Limitations</h2>

<ul>
  <li>
    <p>Difficulty in customizing server configurations</p>
  </li>
  <li>
    <p>Inadequate supporting documentation</p>
  </li>
</ul>

<h2 id="microsoft-azure-kubernetes-service-aks">Microsoft Azure Kubernetes Service (AKS)</h2>

<p><a href="https://azure.microsoft.com/en-us/services/kubernetes-service/">Azure Kubernetes Service (AKS)</a> is a highly secure, available, and fully managed Kubernetes service that allows you to deploy and manage containerized apps more easily. AKS offers an integrated CI/CD experience, serverless Kubernetes, enterprise-grade governance, and security. Your DevOps team can leverage AKS to build, deliver, and scale applications fast.</p>

<p>AKS has a financially backed Service Level Agreement that guarantees 99.95% availability for Kubernetes clusters that use Azure Availability Zones and 99.9% availability for clusters that don’t. Unlike the other two providers, Azure doesn’t charge for Kubernetes cluster management. However, you pay for the resources you use, like networking, storage resources, and virtual machine instances.</p>

<h2 id="strengths-2">Strengths</h2>

<ul>
  <li>
    <p>Faster end-to-end development experience via Azure DevOps, Kubernetes tools, Visual Studio Code, and Azure Monitor</p>
  </li>
  <li>
    <p>Advanced access and identity management using Azure Active Directory</p>
  </li>
</ul>

<h2 id="limitations-2">Limitations</h2>

<ul>
  <li>
    <p>No automatic node updates</p>
  </li>
  <li>
    <p>Inability to update availability zone settings after a cluster is created</p>
  </li>
</ul>

<h2 id="comparison-criteria">Comparison Criteria</h2>

<table>
  <thead>
    <tr>
      <th>Comparison Criteria</th>
      <th>EKS</th>
      <th>GKE</th>
      <th>AKS</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Kubernetes Release</td>
      <td>1.15, 1.16, 1.17, 1.18</td>
      <td>1.16, 1.17, 1.18</td>
      <td>1.15, 1.16, 1.17, 1.18, 1.19</td>
    </tr>
    <tr>
      <td>Cluster management</td>
      <td>$0.10/hour</td>
      <td>$0.10/hour</td>
      <td>Free</td>
    </tr>
    <tr>
      <td>SLA</td>
      <td>99.9%</td>
      <td>99.95% for regional deployments and 99.5% for zonal ones</td>
      <td>99.95% for clusters that use Azure Availability Zones and 99.9% for clusters that don’t</td>
    </tr>
    <tr>
      <td>Year Started</td>
      <td>2018</td>
      <td>2014</td>
      <td>2018</td>
    </tr>
    <tr>
      <td>Serverless compute</td>
      <td>Fargate</td>
      <td>Cloud Run</td>
      <td>Azure Container Instances</td>
    </tr>
    <tr>
      <td>Bare Metal Nodes</td>
      <td>Yes</td>
      <td>No</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Resource monitoring</td>
      <td>Third-party (Prometheus)</td>
      <td>Stackdriver</td>
      <td>Azure Monitor</td>
    </tr>
  </tbody>
</table>
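<p>The SLA percentages in the table are easier to compare once they are converted into tolerated downtime. The following Python sketch does that conversion; the 30-day period and the function name are assumptions for illustration, and each provider’s SLA document defines its own measurement period and exclusions.</p>

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Minutes of downtime that an availability target tolerates over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


# The three availability levels that appear in the comparison table.
for sla in (99.5, 99.9, 99.95):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% availability tolerates about {minutes:.1f} minutes per 30 days")
```

<p>Seen this way, the gap between 99.5% and 99.95% is the difference between several hours and a few tens of minutes of tolerated downtime per month.</p>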

<p>Google, Amazon, and Microsoft Azure all offer managed Kubernetes solutions, so you’re at liberty to select the provider whose offerings meet your organization’s needs and requirements. If you don’t want to use EKS, AKS, or GKE, the three cloud providers also have alternatives that you can use to run container workloads.</p>

<h1 id="alternative-to-kubernetes-for-running-container-workloads">Alternative to Kubernetes for Running Container Workloads</h1>

<p>There are two ways to run container workloads on the cloud: using managed Kubernetes services like AKS, GKE, EKS or by <a href="https://iamondemand.com/blog/what-why-how-run-serverless-kubernetes-pods-using-amazon-eks-and-aws-fargate/">leveraging the serverless container services</a> provided by the cloud providers. If you don’t plan to run on the cloud, you can also adopt a self-hosted solution like OpenShift to run container workloads.</p>

<p>With serverless container services, the provider manages the life cycle and availability of your container workloads for you. Examples of serverless container services include Google Cloud Run, AWS Fargate, and Azure Container Instances.</p>

<h2 id="aws-fargate">AWS Fargate</h2>

<p><a href="https://aws.amazon.com/fargate/?whats-new-cards.sort-by=item.additionalFields.postDateTime&amp;whats-new-cards.sort-order=desc&amp;fargate-blogs.sort-by=item.additionalFields.createdDate&amp;fargate-blogs.sort-order=desc">AWS Fargate</a> is a serverless compute engine that lets you run containers without the need to provision, configure, or scale clusters of your virtual machines. With Fargate, you don’t have to choose server types, decide when to optimize cluster packing, or scale your cluster. Development teams can thus focus on building and operating applications while Fargate manages all the infrastructure and scaling needed to run the app with high availability.</p>

<h2 id="google-cloud-run">Google Cloud Run</h2>

<p><a href="https://cloud.google.com/run">Cloud Run</a> is a serverless compute platform that allows you to build and deploy highly scalable applications without the need for Kubernetes-based or VM deployments. With Cloud Run, you can build applications with your favorite tools/dependencies and in your favorite language and deploy the apps in seconds.</p>

<p>Cloud Run offers several benefits, including auto-scaling, no up-front provisioning, and zero server management. It’s ideal for use cases like mobile backends, web applications, stateless HTTP applications, streaming, batch data processing, etc.</p>

<h2 id="azure-container-instances">Azure Container Instances</h2>

<p><a href="https://docs.microsoft.com/en-us/azure/container-instances/">Azure Container Instances (ACI)</a> is a serverless compute platform that enables you to easily run containers without having to handle the management of the virtual servers. ACI lets you deploy containers in the cloud with remarkable speed and simplicity. It also enables you to secure applications with hypervisor isolation.</p>

<p>By running your application in <a href="https://iamondemand.com/blog/running-kubernetes-windows-containers-on-the-azure-cloud/">Azure Container Instances</a>, you can focus on developing applications without worrying about managing the infrastructure that runs the app. ACI is suitable for <a href="https://azure.microsoft.com/en-us/services/container-instances/#documentation">data-processing jobs, building event-driven applications, and elastic bursting</a>.</p>

<h1 id="wrapping-up">Wrapping Up</h1>

<p>Containers solve the problem associated with the multiple, different environments that development teams must deal with today, letting them build and deploy the same application for development, testing, and production environments. And Kubernetes allows you to orchestrate these containers.</p>

<p>Although there are several cloud service providers for your container workload orchestration, you need to compare what each provider brings to the table. By understanding the strengths and limitations of the various managed Kubernetes services, you’ll be well positioned to select the right provider that perfectly suits your organization’s needs.</p>]]></content><author><name>samuel</name></author><category term="SoftwareEngineering" /><category term="containers" /><category term="cloud" /><summary type="html"><![CDATA[Prior to container technology, making applications run on different environments was one of the greatest struggles for a developer. “It runs on my machine” was a common frustration you heard from engineers—including me, many times.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/images/logo.png" /><media:content medium="image" url="https://hubofco.de/images/logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The 3 Pillars of System Observability: Logs, Metrics, and Tracing</title><link href="https://hubofco.de/softwareengineering/2021/02/14/the-3-pillars-of-system-observability-logs-metrics-and-tracing/" rel="alternate" type="text/html" title="The 3 Pillars of System Observability: Logs, Metrics, and Tracing" /><published>2021-02-14T10:14:00+00:00</published><updated>2021-02-14T10:14:00+00:00</updated><id>https://hubofco.de/softwareengineering/2021/02/14/the-3-pillars-of-system-observability-logs-metrics-and-tracing</id><content type="html" xml:base="https://hubofco.de/softwareengineering/2021/02/14/the-3-pillars-of-system-observability-logs-metrics-and-tracing/"><![CDATA[<p><img src="https://res.cloudinary.com/samueljames/image/upload/c_scale,w_300/v1613297562/image1.png" alt="" />
Microservice architecture has become the new model for building modern-day applications. While decoupled services are easy to scale and manage, increasing interactions between those services have created a new set of problems. It’s no surprise that debugging was listed as a major challenge in the <a href="https://tsh.io/blog/what-are-microservices-in-2020-key-findings-from-survey-report/">annual state of microservices report</a>.</p>

<blockquote>
  <p>Editorial note:  This post was initially published on Iamondemand’s blog. You can check out the <a href="https://iamondemand.com/blog/the-3-pillars-of-system-observability-logs-metrics-and-tracing/">original here</a>, at their site.</p>
</blockquote>

<p>When your systems are distributed, various things can go wrong. Even if you’ve written the perfect code, a node may fail, a connection may time out, or participant servers may act arbitrarily. The bottom line is that things will break. And when they do, you want to be able to identify and fix the problem as soon as possible, before it degrades the entire system’s performance or affects customers or your organization’s reputation. For this reason, we need observability to run today’s services and infrastructure.</p>

<p>People have varying knowledge of what observability means. For some engineers, it’s the old wine of <a href="https://iamondemand.com/blog/how-to-properly-monitor-your-k8s-clusters-and-pods/">monitoring</a> in a pristine bottle. For others, it’s an umbrella concept that includes log analysis, trace analysis for distributed systems, visualization, and alerts management.</p>

<p>Honeycomb, in its <a href="https://www.honeycomb.io/wp-content/uploads/2018/07/Honeycomb-Guide-Achieving-Observability-v1.pdf">Guide to Achieving Observability</a>, defines observability as the ability to ask arbitrary questions about your production environment without having to know beforehand what you want to ask. Despite the variability in these definitions, they all point to the overarching goal of observability, which is to achieve better, unprecedented visibility into systems.</p>

<p>Observability is a system that enables you to understand what’s really happening in your software, from the outside. An observable system provides all the information you need in real time to address the day-to-day questions you might have about a system. It also enables you to navigate from effect to cause whenever the system develops a fault.</p>

<p>An effective observability solution may address questions like:</p>

<ul>
  <li>Why is “y” broken?</li>
  <li>What went wrong during the release of feature “x”?</li>
  <li>Why has system performance degraded over the past few months?</li>
  <li>What did my service look like at point “y”?</li>
  <li>Is this system issue affecting specific users or all of them?</li>
</ul>

<p>You may ask why you should put in the effort to make your systems observable. I discuss the reasons in the following sections.</p>

<h1 id="the-unpredictability-of-distributed-systems">The Unpredictability of Distributed Systems</h1>

<p>A <a href="https://en.wikipedia.org/wiki/Distributed_computing">distributed system</a> comprises multiple components located on different networked computers, communicating and coordinating their actions by passing messages from one to another. The inherent integrations and nature of distributed systems lead to “layers of distinct ownership,” which are sometimes challenging to manage. By implementing observability across a development environment, you’re able to understand your system’s failure modes and trace issues to their root cause.</p>

<h1 id="the-need-to-always-maintain-software">The Need to Always Maintain Software</h1>

<p>After an application is designed, implemented, and launched, it also has to be maintained. Development teams need to continuously adapt to changing customer behavior and ensure that the application runs at peak performance levels and works as expected. An observable system makes application maintenance easier. You’re able to easily identify errors, fix bugs that might arise, customize the application to users’ needs, and eventually improve the app’s performance.</p>

<h1 id="the-need-to-speed-up-code-development-and-deployment">The Need to Speed Up Code Development and Deployment</h1>

<p>DevOps is about speed: faster application development and shipments, more immediate updates, and continuous development—all of which lead to a <a href="https://www.microtool.de/en/knowledge/devops-for-faster-software-development-and-deployment/">shortened development lifecycle and provide faster and continuous delivery</a> with high software quality. If your development teams cannot identify and address issues before they occur, or they cannot react quickly to changes, it may be difficult to speed up the time to market. By leveraging a strong observability platform, software development teams can increase speed and effectiveness in deploying code, updating, and tracking changes.</p>

<p>In a nutshell, observability offers many benefits to managers and engineers. It also plays a vital role in an organization—especially in maintaining and enhancing a system’s overall architecture and performance. As distributed systems become more complex, engineers can infer the system’s internal states through externally available knowledge, including tracing data, log data, monitoring data, and so on.</p>

<h1 id="pillars-of-observability">Pillars of Observability</h1>

<p>Observability is about instrumenting systems to gather useful data that tells you when and why systems are behaving in a certain way. Whenever a system develops a fault, you should first examine telemetry data to better understand why the error occurred.</p>

<p>The following telemetry data—commonly referred to as the pillars of observability—are key to achieving observability in distributed systems:</p>

<ul>
  <li>Logs</li>
  <li>Metrics</li>
  <li>Tracing</li>
</ul>

<p><img src="https://res.cloudinary.com/samueljames/image/upload/v1613297562/image1.png" alt="img" />Figure 1: Three pillars of observability</p>

<h2 id="logs">Logs</h2>

<p>Logs are structured or unstructured lines of text that a system produces when certain code runs. In general, you can think of a log as a record of an event that happened within an application. Logs help uncover unpredictable and emergent behaviors exhibited by components of a microservices architecture.</p>

<p>They’re easy to generate and instrument. And most application frameworks, libraries, and languages come with support for logging. Virtually every component of a distributed system generates logs of actions and events at any point.</p>

<p>Log files provide comprehensive system details, such as a fault, and the specific time when the fault occurred. By analyzing the logs, you can troubleshoot your code and identify where and why the error occurred. Logs are also useful for troubleshooting security incidents in load balancers, caches, and databases.</p>
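<p>As a rough illustration of a structured log, the Python sketch below emits one JSON object per event using only the standard library. The service name, the order_id field, and the helper names are hypothetical and chosen for this example; real systems usually rely on a logging library or agent rather than a hand-written formatter.</p>

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Fields passed through `extra=` become attributes on the record.
            "order_id": getattr(record, "order_id", None),
        }
        return json.dumps(payload)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr when None)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


logger = build_logger("checkout-service")  # hypothetical service name
logger.error("payment declined", extra={"order_id": "A-1001"})
```

<p>Because every line carries the same fields, a log platform can filter by level or order_id instead of searching free text, which is what makes the time and location of a fault quick to find.</p>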

<h2 id="metrics">Metrics</h2>

<p>Metrics are a numerical representation of data that can be used to determine a service or component’s overall behavior over time. Metrics comprise a set of attributes (e.g., name, value, label, and timestamp) that convey information about <a href="https://cloud.google.com/blog/products/gcp/sre-fundamentals-slis-slas-and-slos">SLAs, SLOs, and SLIs.</a></p>

<p>Unlike an event log, which records specific events, metrics are a measured value derived from system performance. Metrics are real time-savers because you can easily correlate them across infrastructure components to get a holistic view of system health and performance. They also enable easier querying and longer retention of data.</p>

<p>You can gather metrics on system uptime, response time, the number of requests per second, and how much processing power or memory an application is using, for example. Typically, SREs and ops engineers use metrics to trigger alerts whenever a system value goes above a specified threshold.</p>

<p>For example, let’s say you want to monitor the requests per second in an HTTP service. You notice an abrupt spike in traffic, and you want to know what’s happening in your system. Metrics provide deeper visibility and insight that help you understand the cause of the spike. The spike could be due to an incorrect service configuration, malicious behavior, or issues with other parts of your system. In addition to providing visibility, you can also use the information to detect and determine the severity of issues.</p>
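<p>The requests-per-second scenario can be sketched in a few lines of Python. This is a simplified, in-memory illustration of a threshold alert, assuming a 10-second sliding window and a made-up threshold; the class and method names are hypothetical, and in production this job normally belongs to a metrics system rather than application code.</p>

```python
from collections import deque


class RequestRateMonitor:
    """Track request timestamps and flag when the per-second rate exceeds a threshold."""

    def __init__(self, threshold_rps, window_seconds=10):
        self.threshold_rps = threshold_rps
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def record(self, timestamp):
        """Record one request at `timestamp` (in seconds) and evict entries outside the window."""
        self.timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # The entry just appended is always newer than the cutoff, so the deque never empties here.
        while cutoff >= self.timestamps[0]:
            self.timestamps.popleft()

    def rate(self):
        """Average requests per second over the window."""
        return len(self.timestamps) / self.window_seconds

    def should_alert(self):
        return self.rate() > self.threshold_rps


monitor = RequestRateMonitor(threshold_rps=5, window_seconds=10)
for i in range(100):
    monitor.record(100.0 + i * 0.1)  # simulated traffic: one request every 0.1 seconds
print(monitor.rate(), monitor.should_alert())
```

<p>The alert only tells you that the rate crossed the threshold. Working out whether the cause is a misconfiguration, malicious traffic, or another component still requires the logs and traces described in this post.</p>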

<h2 id="trace">Trace</h2>

<p>Although logs and metrics might be adequate for understanding individual system behavior and performance, they rarely provide helpful information for understanding the lifetime of a request in a distributed system. To view and understand the entire lifecycle of a request or action across several systems, you need another observability technique called tracing.</p>

<p>A trace represents the entire journey of a request or action as it moves through all the nodes of a distributed system. Traces allow you to profile and observe systems, especially containerized applications, serverless architectures, or microservices architecture. By analyzing trace data, you and your team can measure overall system health, pinpoint bottlenecks, identify and resolve issues faster, and prioritize high-value areas for optimization and improvements.</p>

<p><a href="https://iamondemand.com/blog/open-source-distributed-tracing-why-you-need-it-how-to-get-started/">Traces are an essential pillar of observability</a> because they provide context for the other components of observability. For instance, you can analyze a trace to identify the most valuable metrics based on what you’re trying to accomplish, or the logs relevant to the issue you’re trying to troubleshoot.</p>

<p>Tracing is better suited for debugging and monitoring complex applications that contend for resources (e.g., a mutex, disk, or network) in a nontrivial manner. Tracing provides quick answers to the following questions in distributed software environments:</p>

<ul>
  <li>Which services have inefficient or problematic code that should be prioritized for optimization?</li>
  <li>What is the health and performance of the services that make up a distributed architecture?</li>
  <li>What are the performance bottlenecks that could affect the overall end-user experience?</li>
</ul>
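<p>To show what a trace contains, here is a deliberately small Python sketch in which two functions stand in for two services. Each unit of work is recorded as a span, and all spans share one trace ID so the request can be followed end to end. The span fields, function names, and in-memory list are assumptions for illustration; they do not reflect the API of any particular tracing library.</p>

```python
import time
import uuid
from contextlib import contextmanager

finished_spans = []  # stands in for a tracing backend


@contextmanager
def span(name, trace_id=None, parent_id=None):
    """Time a unit of work and record it as a span that belongs to one trace."""
    record = {
        "trace_id": trace_id or uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,
        "name": name,
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        finished_spans.append(record)


def charge_card(trace_id, parent_id):
    """Hypothetical downstream service: joins the caller's trace as a child span."""
    with span("payment.charge", trace_id, parent_id):
        time.sleep(0.01)  # simulated work


def handle_checkout():
    """Hypothetical entry point: starts the trace and passes its IDs downstream."""
    with span("checkout.request") as root:
        charge_card(root["trace_id"], root["span_id"])
    return root["trace_id"]


handle_checkout()
for finished in finished_spans:
    print(finished["name"], finished["parent_id"], round(finished["duration_ms"], 1))
```

<p>The parent and child relationship between spans is what lets you see which service a slow request spent its time in, and therefore where to look for a bottleneck.</p>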

<p>Although logs, traces, and metrics each serve their own unique purpose, they all work together to help you better understand the performance and behavior of distributed systems. If your organization already uses microservice or serverless architecture, or if they plan to adopt containers and microservices, a combination of all the telemetry data will provide the detailed information needed to understand and debug your system.</p>

<h1 id="implementing-observability">Implementing Observability</h1>

<p>The fundamental process of implementing an observability-centric culture across your development environment relies on adopting tools that enable you to support and derive insights from the pillars of observability outlined above.</p>

<p>First, you need to select a modern observability platform that perfectly fits your specific needs. A full-stack platform that condenses all the telemetry data from metrics, logs, and traces into a visual, intuitive, and easy-to-understand dashboard can be a good starting point. (Just be careful not to choose a platform that locks you into a specific cloud platform.)</p>

<p>Then, you need to identify and monitor metrics related to issues you’ve already experienced and those you could likely encounter in the future.</p>

<p>After integrating observability into your incident management process, the next step is to establish an observability-centric culture across your organization. Without this, you won’t be able to get the best out of tracing, metrics, and log management. No amount of observability tooling can serve as a substitute for sound engineering instincts and intuition.</p>

<p>By selecting the right observability platform, getting a connected view of your system’s performance data in a singular view, and establishing an observability culture, you’ll be able to identify issues faster, understand what caused them, and eventually, build customer-focused products at greater speeds.</p>

<h1 id="conclusion">Conclusion</h1>

<p>As the adoption of microservices and containers increases, engineers and managers need to develop a culture of observability. Doing so will help your teams <a href="https://newrelic.com/resources/ebooks/what-is-observability">get the best out of your investments in the cloud</a>. It will help you form a continuous innovation culture that delivers high-end software to customers.</p>

<p>It doesn’t matter if you’re a seasoned DevOps pro or just getting started with observability. Understanding the three pillars of observability is an essential part of developing an observability-centric culture. Practice what you’ve learned in this post, and you’ll be on your way to achieving observability in distributed systems.</p>]]></content><author><name>samuel</name></author><category term="SoftwareEngineering" /><category term="observability" /><category term="tracing" /><category term="logs" /><category term="metrics" /><summary type="html"><![CDATA[Microservice architecture has become the new model for building modern-day applications. While decoupled services are easy to scale and manage, increasing interactions between those services have created a new set of problems. It’s no surprise that debugging was listed as a major challenge in the annual state of microservices report.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://res.cloudinary.com/samueljames/image/upload/c_scale,w_300/v1613297562/image1.png" /><media:content medium="image" url="https://res.cloudinary.com/samueljames/image/upload/c_scale,w_300/v1613297562/image1.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">What Is Amazon SageMaker, and How Do I Use It?</title><link href="https://hubofco.de/machinelearning/2021/02/14/what-is-amazon-sagemaker-and-how-do-i-use-it/" rel="alternate" type="text/html" title="What Is Amazon SageMaker, and How Do I Use It?" /><published>2021-02-14T10:04:00+00:00</published><updated>2021-02-14T10:04:00+00:00</updated><id>https://hubofco.de/machinelearning/2021/02/14/what-is-amazon-sagemaker-and-how-do-i-use-it</id><content type="html" xml:base="https://hubofco.de/machinelearning/2021/02/14/what-is-amazon-sagemaker-and-how-do-i-use-it/"><![CDATA[<p><img src="https://res.cloudinary.com/samueljames/image/upload/c_scale,w_300/v1613297243/pasted-image-0.png" alt="" />
We’re in the fourth industrial age, where data is accumulated, collated, analyzed, and interpreted at lightning-fast speeds. Highly competitive <a href="https://iamondemand.com/blog/5-must-have-dsml-platform-capabilities-to-stay-competitive-in-2020-and-beyond/">organizations enhance their platforms with AI</a>, but many companies are unable to process the information they collect and use it in a meaningful way because they have insufficient computing resources, storage, and even availability. That’s where cloud-native AI and machine learning (ML) like Amazon SageMaker come into play. In this post, you’ll learn about Amazon SageMaker—what it is and how to use it.</p>

<blockquote>
  <p>Editorial note:  This post was initially published on Iamondemand’s blog. You can check out the original <a href="https://iamondemand.com/blog/what-is-amazon-sagemaker-and-how-do-i-use-it-2/">here</a>, at their site.</p>
</blockquote>

<h1 id="machine-learning-in-the-cloud">Machine Learning in the Cloud</h1>

<p>Cloud-based machine learning enables organizations to access high-performance resources and infrastructures that they couldn’t use (or even afford) on their own. Today, cloud use involves networking, storage, and computing, but when cloud computing is combined with ML to form the “intelligent cloud,” its capabilities increase and it is much easier to protect, scale, and handle. With the great strides taking place in cloud and machine learning development, their future seems ever-more tied together.</p>

<p>Although cloud computing is not new, some organizations have yet to embrace it. The question that comes to mind is: What do you gain by running ML workloads in the cloud? Below are some answers.</p>

<h1 id="reduced-costs">Reduced Costs</h1>

<p>Hosting your machine learning workloads on the cloud will save you a considerable amount in capital costs because you won’t need to invest in physical hardware.</p>

<p>Also, you won’t need to hire professional personnel to maintain the hardware, since cloud service providers buy and manage the hardware equipment themselves.</p>

<h1 id="availability">Availability</h1>

<p>In an on-premises work environment, anything can happen. Things break: processors burn out, hard drives get corrupted, and networks get interrupted.</p>

<p>The cloud aims to deliver abstract computing space, memory, and power, making you forget about the challenges of the physical world. Fortunately, most cloud service providers are reliable and maintain 99.9% uptime, so your machine learning team or engineers can always get their work done on the cloud. And of course, you can access ML services on the cloud from anywhere at any time.</p>

<h1 id="elasticity">Elasticity</h1>

<p>Often, businesses don’t predict their future needs correctly and purchase more hardware than they need or else don’t have enough servers to handle computing tasks. Sadly, as they try to correct their errors, the process becomes more and more expensive.</p>

<p>The cloud enables businesses to experiment with different ML technologies and scale up or down as and when needed. With nominal monthly fees, they can expand capacity.</p>

<p>Of course, the resources of every cloud service provider are limited, but they’re still far beyond what 99% of businesses will need. A good cloud computing system has many different tools, bandwidths, and storage space options.</p>

<h1 id="on-demand-self-service">On-Demand Self-Service</h1>

<p>When machine learning workloads are hosted in the cloud, you can access them at any time without administrator approval, as you would need with on-premise data centers. Cloud computing achieves this through automation and self-service.</p>

<p>On-demand self-service also helps businesses to plan. For instance, you can request new virtual machines or storage capacities at any time and expect to have them within seconds.</p>

<h1 id="broad-network-access">Broad Network Access</h1>

<p>With machine learning workloads hosted in the cloud, your data is reachable over the network from wherever you work. Machine learning uses data as its input for smarter decision-making and improved performance, and cloud environments can hold massive volumes of data. That means that using ML in the cloud helps you overcome the delays and accessibility problems of traditional data silos. ML and AI tooling can also help with moving data between on-premise infrastructure and the cloud environment.</p>

<h1 id="amazon-sagemaker-cloud-ml-service">Amazon SageMaker Cloud ML Service</h1>

<p>Amazon SageMaker is a fully managed service that helps data scientists, analysts, and software developers build, train, and deploy machine learning models in the cloud.</p>

<p>SageMaker provides an integrated Jupyter notebook environment for easy access to your data sources for analysis and exploration, as well as optimized ML algorithms that run efficiently against large amounts of data in a distributed environment. SageMaker aims to remove the obstacles that typically block or delay the building of ML solutions.</p>

<h1 id="how-sagemaker-works">How SageMaker Works</h1>

<p>SageMaker has a three-step process that simplifies machine learning modelling. Let’s take a look:</p>

<p><img src="https://iamondemand.com/wp-content/uploads/2020/08/pasted-image-0.png" alt="amazon sagemaker diagram" />Figure 1: AWS SageMaker: Build, train, deploy</p>

<h2 id="build">Build</h2>

<p>SageMaker has a fully integrated development environment you can use to build production-ready ML models. With its hosted, easy-to-use Jupyter notebooks, you can visualize and explore data stored in Amazon S3. SageMaker also comes with many optimized ML algorithms built in.</p>

<h2 id="train">Train</h2>

<p>You can use SageMaker Experiments to organize, track, assess, and compare every iteration of your ML models. You can train your ML model on managed infrastructure by making a simple API call or with a single click. With the help of its automatic model tuning capabilities, you can find the best set of hyperparameters for your preferred algorithm.</p>
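<p>To make the idea of hyperparameter tuning concrete, here is a minimal sketch of the simplest possible strategy: try every combination and keep the best one. This is not how SageMaker’s automatic model tuning is implemented (the managed service uses its own search strategies). The scoring function <code>toy_validation_error</code> and the helper <code>grid_search</code> are hypothetical names invented for this illustration, and the scoring function only pretends to train a model.</p>

```python
from itertools import product


def toy_validation_error(max_depth, eta):
    """Hypothetical stand-in for 'train a model and measure its validation error'.

    For illustration we simply pretend the best settings are max_depth=5, eta=0.2.
    """
    return abs(max_depth - 5) + abs(eta - 0.2)


def grid_search(search_space, score_fn):
    """Try every combination in search_space; return (lowest score, its settings)."""
    names = sorted(search_space)
    best_score, best_params = None, None
    for values in product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        # Keep the combination with the lowest error seen so far.
        if best_score is None or best_score > score:
            best_score, best_params = score, params
    return best_score, best_params


search_space = {"max_depth": [3, 5, 7], "eta": [0.1, 0.2, 0.3]}
best_score, best_params = grid_search(search_space, toy_validation_error)
print(best_params)
```

<p>Exhaustive search grows quickly with the number of hyperparameters, which is one reason a managed tuning service is attractive for real training jobs.</p>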

<h2 id="deploy">Deploy</h2>

<p>After model training and hyperparameter tuning, your next step is to deploy the ML model so it can generate predictions, either for batches of data or for new data as it arrives. Once the model is deployed, it will be hosted on auto-scaling Amazon ML instances across several availability zones for high availability and performance.</p>

<h1 id="amazon-sagemaker-features">Amazon SageMaker Features</h1>

<p>SageMaker has many useful features that make building, training, and deploying ML models easier. These include the following:</p>

<h2 id="inbuilt-machine-learning-algorithms">Inbuilt Machine Learning Algorithms</h2>

<p>Amazon SageMaker comes with many built-in ML algorithms that you can use to train on large datasets. These algorithms are always available and are optimized for accuracy, scale, and speed.</p>

<p>SageMaker includes supervised ML algorithms such as linear/logistic regression and XGBoost. These algorithms can be used to address problems such as time series prediction or recommendation. SageMaker also supports unsupervised learning, with algorithms such as principal component analysis and k-means. These unsupervised algorithms are used to address problems like clustering and customer segmentation.</p>

<h2 id="end-to-end-machine-learning-platform">End-to-End Machine Learning Platform</h2>

<p>An end-to-end machine learning platform is a platform designed to speed up modeling and deployment and ensure scalability and reliability in production. Amazon SageMaker supports the end-to-end lifecycle of ML applications, from data collection to model building, training, deployment, and scaling.</p>

<h2 id="zero-setup">Zero Setup</h2>

<p>Research shows that there’s an increasing demand for ML, and the future of ML is promising. Yet according to a 2018 US Census Bureau survey of nearly 600,000 businesses, <a href="https://www.wired.com/story/ai-why-not-more-businesses-use/">a mere 2.8% were using machine learning</a>, and only 8.9% were using any form of AI. This shows that many enterprises are facing a crucial knowledge gap—but the cloud offers a solution.</p>

<p>Amazon SageMaker lets you implement ML features without some of the operational overhead that comes with an on-premise setup. SageMaker also provides APIs and SDKs, which frees you from responsibility for setup and lets you embed ML functionality as you go.</p>

<h2 id="pay-for-what-you-use">Pay for What You Use</h2>

<p>Using the cloud gives you cost-effective resourcing and increased agility. With AWS SageMaker, you pay only for resources that you use. When building, training, and deploying your ML models on SageMaker, you’ll be billed by the second, with no upfront commitments and no minimum fees. Pricing is broken down by on-demand ML storage, ML instances, and data processing.</p>

<p>The pay-as-you-use model is well suited to bursty ML workloads and lets you use the power and speed of GPUs for training without incurring the cost of a hardware investment.</p>
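<p>As a rough illustration of what per-second billing means in practice, the sketch below computes the cost of a short training job. The hourly rate is a made-up number chosen only to keep the arithmetic easy to follow; it is not an actual SageMaker price, and the function name <code>per_second_cost</code> is hypothetical.</p>

```python
def per_second_cost(hourly_rate_usd, seconds_used):
    """Cost of a job that is billed by the second at the given hourly rate."""
    return hourly_rate_usd / 3600 * seconds_used


# Hypothetical rate in USD per hour, invented for this example.
HYPOTHETICAL_HOURLY_RATE = 0.24
training_seconds = 15 * 60  # a 15-minute training job

cost = per_second_cost(HYPOTHETICAL_HOURLY_RATE, training_seconds)
print(f"Training job cost: ${cost:.2f}")
```

<p>With hourly billing, the same 15-minute job would be charged as a full hour, so per-second billing matters most for short, frequent jobs.</p>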

<h2 id="flexible-model-training">Flexible Model Training</h2>

<p>ML model development can be an iterative, expensive, and complicated process. Also, model training jobs can take anywhere from minutes to hours or even days. With SageMaker, you can efficiently train and build ML models at scale. This means you can experiment with ML capabilities and services and automatically scale based on resource demands.</p>

<h1 id="training-your-first-model-on-sagemaker">Training Your First Model on SageMaker</h1>

<p>On a cool Monday morning, you get an email from your boss saying, “Our focus for the next quarter is building an application that predicts whether a customer will sign up for a Certificate of Deposit (CD).” The target customers are people who are looking to invest their capital. Your job is to build a machine learning model capable of predicting whether a customer will enroll, given the features and the data from the marketing team. This is a typical end-to-end ML project that you can efficiently build and deploy on SageMaker. In this section, we’ll learn how to get started with SageMaker and how to train and deploy an ML model on it.</p>

<p>Note: To use the service, you’ll need to <a href="https://portal.aws.amazon.com/billing/signup#/paymentinformation">create an AWS account</a>.</p>

<p>Now let’s dive right in.</p>

<h2 id="step-1-create-a-notebook-instance">Step 1: Create a Notebook Instance</h2>

<ol>
  <li>
    <p>Log in to the AWS Management Console, type “SageMaker” in the search bar, and select it.</p>
  </li>
  <li>
    <p>In the SageMaker console, click “Notebook instances,” then “Create notebook instance.”</p>
  </li>
</ol>

<p><img src="https://iamondemand.com/wp-content/uploads/2020/09/image7.png" alt="img" /></p>

<ol start="3">
  <li>Fill in the required information (like Notebook instance name or Notebook instance type), then click “Create notebook instance.”</li>
</ol>

<p><img src="https://iamondemand.com/wp-content/uploads/2020/08/image2-1024x624.png" alt="img" /></p>

<ol start="4">
  <li>When the instance status changes from “Pending” to “InService,” click “Open Jupyter.”</li>
</ol>

<p><img src="https://iamondemand.com/wp-content/uploads/2020/08/image3.png" alt="img" /></p>

<h2 id="step-2-load-visualize-and-prepare-data">Step 2: Load, Visualize, and Prepare Data</h2>

<ol>
  <li>Once Jupyter opens, click “New” and select the “conda_python3” kernel.</li>
</ol>

<p><img src="https://iamondemand.com/wp-content/uploads/2020/08/image1.png" alt="img" /></p>

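<ol start="2">
  <li>Import the relevant libraries, define some environment variables in the notebook environment (as shown below), and run the cell. Note that the snippets in this tutorial use names from version 1 of the SageMaker Python SDK (such as csv_serializer and s3_input); later SDK versions renamed some of them.</li>
</ol>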

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">seaborn</span> <span class="k">as</span> <span class="n">sns</span>
<span class="kn">import</span> <span class="nn">boto3</span><span class="p">,</span> <span class="n">re</span><span class="p">,</span> <span class="n">sys</span><span class="p">,</span> <span class="n">math</span><span class="p">,</span> <span class="n">json</span><span class="p">,</span> <span class="n">os</span><span class="p">,</span> <span class="n">sagemaker</span><span class="p">,</span> <span class="n">urllib</span><span class="p">.</span><span class="n">request</span>
<span class="kn">from</span> <span class="nn">sagemaker</span> <span class="kn">import</span> <span class="n">get_execution_role</span>
<span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">Image</span>
<span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">display</span>
<span class="kn">from</span> <span class="nn">time</span> <span class="kn">import</span> <span class="n">gmtime</span><span class="p">,</span> <span class="n">strftime</span>
<span class="kn">from</span> <span class="nn">sagemaker.predictor</span> <span class="kn">import</span> <span class="n">csv_serializer</span>

<span class="c1"># Define the IAM role 
</span><span class="n">role</span> <span class="o">=</span> <span class="n">get_execution_role</span><span class="p">()</span>
<span class="n">prefix</span> <span class="o">=</span> <span class="s">"sagemaker/DEMO-xgboost-dm"</span>
<span class="n">containers</span> <span class="o">=</span> <span class="p">{</span><span class="s">"eu-west-1"</span><span class="p">:</span> <span class="s">"685385470294.dkr.ecr.eu-west-1.amazonaws.com/xgboost:latest"</span><span class="p">,</span>
              <span class="s">"us-west-2"</span><span class="p">:</span> <span class="s">"433757028032.dkr.ecr.us-west-2.amazonaws.com/xgboost:latest"</span><span class="p">,</span>
              <span class="s">"us-east-1"</span><span class="p">:</span> <span class="s">"811284229777.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest"</span><span class="p">,</span>
              <span class="s">"us-east-2"</span><span class="p">:</span> <span class="s">"825641698319.dkr.ecr.us-east-2.amazonaws.com/xgboost:latest"</span>
              <span class="p">}</span>
<span class="c1"># Look up the region this notebook instance runs in; each region has its own XGBoost container (see the mapping above)
</span><span class="n">my_region</span> <span class="o">=</span> <span class="n">boto3</span><span class="p">.</span><span class="n">session</span><span class="p">.</span><span class="n">Session</span><span class="p">().</span><span class="n">region_name</span>
<span class="c1"># Report the region and the container image that will be used for the endpoint
</span><span class="k">print</span><span class="p">(</span><span class="s">"Success - the MySageMakerInstance is in the "</span> <span class="o">+</span> <span class="n">my_region</span> <span class="o">+</span> <span class="s">" region. You will use the "</span> <span class="o">+</span> <span class="n">containers</span><span class="p">[</span>
    <span class="n">my_region</span><span class="p">]</span> <span class="o">+</span> <span class="s">" container for your SageMaker endpoint."</span><span class="p">)</span>


</code></pre></div></div>
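<p>One thing to watch for in the cell above: the lookup <code>containers[my_region]</code> raises a bare <code>KeyError</code> if the notebook runs in a region that is not in the mapping. Below is a minimal sketch of a friendlier lookup. The helper name <code>pick_container</code> and the placeholder image URIs are assumptions made for this example; in the notebook you would pass the real <code>containers</code> dictionary.</p>

```python
def pick_container(region, containers):
    """Return the container image URI for a region, or raise a readable error."""
    try:
        return containers[region]
    except KeyError:
        supported = ", ".join(sorted(containers))
        raise ValueError(
            f"No XGBoost container listed for region '{region}'. "
            f"Listed regions: {supported}"
        ) from None


# Placeholder URIs for illustration only; they are not real image locations.
example_containers = {
    "eu-west-1": "example-registry/xgboost:latest",
    "us-east-1": "example-registry-2/xgboost:latest",
}
print(pick_container("us-east-1", example_containers))
```

<p>Failing early with a clear message is cheaper than discovering the problem later, when the estimator is created.</p>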

<ol start="3">
  <li>Create an S3 bucket. The training data and model artifacts will be saved in the bucket. (In the screen capture below, the bucket name is “awsexperimentbucket1000.”)</li>
</ol>

<p>If the S3 bucket is created successfully, your code will run without any errors (as shown below).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">bucket_name</span> <span class="o">=</span> <span class="s">'awsexperimentbucket1000'</span> <span class="c1"># replace with your own name; S3 bucket names must be globally unique
</span><span class="n">s3</span> <span class="o">=</span> <span class="n">boto3</span><span class="p">.</span><span class="n">resource</span><span class="p">(</span><span class="s">'s3'</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">my_region</span> <span class="o">==</span> <span class="s">'us-east-1'</span><span class="p">:</span>
        <span class="c1"># us-east-1 is the default location and takes no LocationConstraint
</span>        <span class="n">s3</span><span class="p">.</span><span class="n">create_bucket</span><span class="p">(</span><span class="n">Bucket</span><span class="o">=</span><span class="n">bucket_name</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">s3</span><span class="p">.</span><span class="n">create_bucket</span><span class="p">(</span><span class="n">Bucket</span><span class="o">=</span><span class="n">bucket_name</span><span class="p">,</span> <span class="n">CreateBucketConfiguration</span><span class="o">=</span><span class="p">{</span><span class="s">'LocationConstraint'</span><span class="p">:</span> <span class="n">my_region</span><span class="p">})</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'S3 bucket created successfully'</span><span class="p">)</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'S3 error: '</span><span class="p">,</span> <span class="n">e</span><span class="p">)</span>

</code></pre></div></div>

<ol start="4">
  <li>
    <p>Download the data, then load it into a dataframe.</p>

    <p>If everything is successful, your code will run without any errors (as shown below).</p>
  </li>
</ol>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1"># Download the dataset from the link below and save it locally as bank_clean.csv
</span><span class="k">try</span><span class="p">:</span>
    <span class="n">urllib</span><span class="p">.</span><span class="n">request</span><span class="p">.</span><span class="n">urlretrieve</span><span class="p">(</span><span class="s">"https://d1.awsstatic.com/tmt/build-train-deploy-machine-learning-model-sagemaker/bank_clean.27f01fbbdf43271788427f3682996ae29ceca05d.csv"</span><span class="p">,</span> <span class="s">"bank_clean.csv"</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'Data downloaded successfully'</span><span class="p">)</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'Data download error: '</span><span class="p">,</span> <span class="n">e</span><span class="p">)</span>

<span class="c1"># Load the downloaded CSV into a dataframe, using the first column as the index
</span><span class="k">try</span><span class="p">:</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s">'./bank_clean.csv'</span><span class="p">,</span> <span class="n">index_col</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'Data loaded into dataframe successfully'</span><span class="p">)</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'Data load error: '</span><span class="p">,</span> <span class="n">e</span><span class="p">)</span>

</code></pre></div></div>

<p>You can use the pandas head() method and the shape and columns attributes to explore the dataset.</p>

<p><img src="https://iamondemand.com/wp-content/uploads/2020/08/image6-1.png" alt="img" /></p>
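<p>If you want to try these calls without downloading anything, here is a small self-contained sketch. The dataframe <code>toy_df</code> is invented for this example; only the <code>y_no</code> and <code>y_yes</code> column names mirror the tutorial’s dataset, and the values are made up.</p>

```python
import pandas as pd

# Toy stand-in for the real dataframe; the values are invented.
toy_df = pd.DataFrame({
    "age": [25, 41, 37, 52],
    "y_no": [1, 1, 0, 1],
    "y_yes": [0, 0, 1, 0],
})

print(toy_df.head(2))          # first two rows
print(toy_df.shape)            # (number of rows, number of columns)
print(list(toy_df.columns))    # column names
print(toy_df["y_yes"].mean())  # share of rows with a positive label
```

<p>Checking the share of positive labels early is worthwhile, because a heavily imbalanced target changes how you should read accuracy later on.</p>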

<ol start="5">
  <li>Divide the dataset into a train set and a test set.</li>
</ol>

<p>The train set (75% of the data) will be used to build the model, while the test set (25% of the data) will be used to evaluate the model’s performance.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1"># Shuffle the rows, then use np.split() to cut the dataset into a train set (first 75%) and a test set (remaining 25%)
</span><span class="n">train_df</span><span class="p">,</span> <span class="n">test_df</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="n">df</span><span class="p">.</span><span class="n">sample</span><span class="p">(</span><span class="n">frac</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">random_state</span><span class="o">=</span><span class="mi">42</span><span class="p">),</span> <span class="p">[</span><span class="nb">int</span><span class="p">(</span><span class="mf">0.75</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">df</span><span class="p">))])</span>
<span class="k">print</span><span class="p">(</span><span class="n">train_df</span><span class="p">.</span><span class="n">shape</span><span class="p">,</span> <span class="n">test_df</span><span class="p">.</span><span class="n">shape</span><span class="p">)</span>

</code></pre></div></div>
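<p>The same shuffle-then-cut split can be written with pandas indexing alone. The sketch below is an alternative, not the tutorial’s own code: the helper <code>split_train_test</code> and the toy dataframe are hypothetical names introduced for illustration.</p>

```python
import pandas as pd


def split_train_test(frame, train_fraction=0.75, seed=42):
    """Shuffle the rows, then cut the frame into a train part and a test part."""
    shuffled = frame.sample(frac=1, random_state=seed)
    cut = int(train_fraction * len(shuffled))
    return shuffled.iloc[:cut], shuffled.iloc[cut:]


# Toy data invented for this example.
toy_df = pd.DataFrame({"feature": range(8), "y_yes": [0, 1] * 4})
toy_train, toy_test = split_train_test(toy_df)
print(toy_train.shape, toy_test.shape)
```

<p>Fixing the seed makes the split reproducible, which matters when you want to compare models trained on different days against the same test rows.</p>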

<h2 id="step-3-train-the-model">Step 3: Train the Model</h2>

<p>In this step, you’ll train your ML model with the train_df dataset.</p>

<ol>
  <li>To use SageMaker’s prebuilt XGBoost model, reformat the training data (label in the first column, no header row), upload it to the S3 bucket, and point SageMaker at that location. See code below:</li>
</ol>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">pd</span><span class="p">.</span><span class="n">concat</span><span class="p">([</span><span class="n">train_df</span><span class="p">[</span><span class="s">'y_yes'</span><span class="p">],</span> <span class="n">train_df</span><span class="p">.</span><span class="n">drop</span><span class="p">([</span><span class="s">'y_no'</span><span class="p">,</span> <span class="s">'y_yes'</span><span class="p">],</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)],</span>
          <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">).</span><span class="n">to_csv</span><span class="p">(</span><span class="s">'train.csv'</span><span class="p">,</span> <span class="n">index</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">header</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">boto3</span><span class="p">.</span><span class="n">Session</span><span class="p">().</span><span class="n">resource</span><span class="p">(</span><span class="s">'s3'</span><span class="p">).</span><span class="n">Bucket</span><span class="p">(</span><span class="n">bucket_name</span><span class="p">).</span><span class="n">Object</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">prefix</span><span class="p">,</span>
                                                                       <span class="s">'train/train.csv'</span><span class="p">)).</span><span class="n">upload_file</span><span class="p">(</span><span class="s">'train.csv'</span><span class="p">)</span>
<span class="n">s3_input_train</span> <span class="o">=</span> <span class="n">sagemaker</span><span class="p">.</span><span class="n">s3_input</span><span class="p">(</span><span class="n">s3_data</span><span class="o">=</span><span class="s">'s3://{}/{}/train'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">bucket_name</span><span class="p">,</span> <span class="n">prefix</span><span class="p">),</span>
                                    <span class="n">content_type</span><span class="o">=</span><span class="s">'csv'</span><span class="p">)</span>

</code></pre></div></div>
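<p>To see what the reformatted file looks like, the sketch below applies the same column reordering to a tiny invented dataframe and writes it to an in-memory buffer instead of a file. Only the <code>y_no</code> and <code>y_yes</code> column names come from the tutorial; <code>toy_train</code> and its values are assumptions for illustration.</p>

```python
import io

import pandas as pd

# Toy stand-in for train_df; the values are invented.
toy_train = pd.DataFrame({
    "age": [25, 41],
    "y_no": [1, 0],
    "y_yes": [0, 1],
})

# Put the label first, then the features. Both one-hot label columns are
# dropped from the feature part so the answer does not leak into the inputs.
ordered = pd.concat(
    [toy_train["y_yes"], toy_train.drop(["y_no", "y_yes"], axis=1)], axis=1
)

buffer = io.StringIO()
ordered.to_csv(buffer, index=False, header=False)
csv_text = buffer.getvalue()
print(csv_text)
```

<p>Each output row starts with the label followed by the feature values, with no header row and no index column.</p>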

<ol start="2">
  <li>Set up an Amazon SageMaker session, instantiate the estimator (XGBoost model), then define its hyperparameters. See code below:</li>
</ol>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1"># Create a SageMaker session and hand it to the estimator
</span><span class="n">xgbsession</span> <span class="o">=</span> <span class="n">sagemaker</span><span class="p">.</span><span class="n">Session</span><span class="p">()</span>
<span class="n">xgboost_model</span> <span class="o">=</span> <span class="n">sagemaker</span><span class="p">.</span><span class="n">estimator</span><span class="p">.</span><span class="n">Estimator</span><span class="p">(</span><span class="n">containers</span><span class="p">[</span><span class="n">my_region</span><span class="p">],</span> <span class="n">role</span><span class="p">,</span> <span class="n">train_instance_count</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
                                              <span class="n">train_instance_type</span><span class="o">=</span><span class="s">'ml.m4.xlarge'</span><span class="p">,</span>
                                              <span class="n">output_path</span><span class="o">=</span><span class="s">'s3://{}/{}/output'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">bucket_name</span><span class="p">,</span> <span class="n">prefix</span><span class="p">),</span>
                                              <span class="n">sagemaker_session</span><span class="o">=</span><span class="n">xgbsession</span><span class="p">)</span>
<span class="c1"># Hyperparameters for a binary classifier trained for 100 boosting rounds
</span><span class="n">xgboost_model</span><span class="p">.</span><span class="n">set_hyperparameters</span><span class="p">(</span><span class="n">max_depth</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span> <span class="n">eta</span><span class="o">=</span><span class="mf">0.2</span><span class="p">,</span> <span class="n">gamma</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="n">min_child_weight</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span> <span class="n">subsample</span><span class="o">=</span><span class="mf">0.8</span><span class="p">,</span> <span class="n">silent</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
                                  <span class="n">objective</span><span class="o">=</span><span class="s">'binary:logistic'</span><span class="p">,</span> <span class="n">num_round</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>

</code></pre></div></div>
<ol start="3">
  <li>Now that the dataset has been loaded and you’ve set up the estimator, use gradient optimization to train the model on the “ml.m4.xlarge” instance. See code below:</li>
</ol>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">xgboost_model</span><span class="p">.</span><span class="n">fit</span><span class="p">({</span><span class="s">'train'</span><span class="p">:</span> <span class="n">s3_input_train</span><span class="p">})</span>
</code></pre></div></div>

<p><img src="https://iamondemand.com/wp-content/uploads/2020/08/image4.png" alt="img" /></p>

<p>Voilà! The model has been trained successfully.</p>

<h2 id="step-4-deploy-the-model">Step 4: Deploy the Model</h2>

<p>Deploy the trained model to an endpoint. See code below:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">xgb_predictor</span> <span class="o">=</span> <span class="n">xgboost_model</span><span class="p">.</span><span class="n">deploy</span><span class="p">(</span><span class="n">initial_instance_count</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">instance_type</span><span class="o">=</span><span class="s">'ml.m4.xlarge'</span><span class="p">)</span>
</code></pre></div></div>

<p>Once the deployment is successful, the code above will run without any errors.</p>

<p>Now that you’ve deployed the model, you can use the test set to generate a set of predictions. See code below.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
  <span class="c1"># load the data into an array
</span><span class="n">test_df_array</span> <span class="o">=</span> <span class="n">test_df</span><span class="p">.</span><span class="n">drop</span><span class="p">([</span><span class="s">'y_no'</span><span class="p">,</span> <span class="s">'y_yes'</span><span class="p">],</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">).</span><span class="n">values</span> 
<span class="c1"># set the data type for an inference
</span><span class="n">xgb_predictor</span><span class="p">.</span><span class="n">content_type</span> <span class="o">=</span> <span class="s">'text/csv'</span>
<span class="c1"># set the serializer type
</span><span class="n">xgb_predictor</span><span class="p">.</span><span class="n">serializer</span> <span class="o">=</span> <span class="n">csv_serializer</span>
<span class="c1"># predict! (the endpoint returns the scores as comma-separated text)
</span><span class="n">predictions</span> <span class="o">=</span> <span class="n">xgb_predictor</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="n">test_df_array</span><span class="p">).</span><span class="n">decode</span><span class="p">(</span><span class="s">'utf-8'</span><span class="p">)</span>
<span class="c1"># turn the predictions into an array
</span><span class="n">predictions_array</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">fromstring</span><span class="p">(</span><span class="n">predictions</span><span class="p">[</span><span class="mi">1</span><span class="p">:],</span> <span class="n">sep</span><span class="o">=</span><span class="s">','</span><span class="p">)</span>
<span class="c1"># show the predictions as a one-column dataframe
</span><span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">predictions_array</span><span class="p">).</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">{</span><span class="mi">0</span><span class="p">:</span> <span class="s">"predicted_values"</span><span class="p">})</span>

</code></pre></div></div>
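<p>The predicted values are scores between 0 and 1, so a natural next step is to compare them with the true labels. The sketch below is one simple way to do that, assuming a 0.5 cutoff; the helper <code>confusion_counts</code>, the threshold, and the toy labels and scores are all assumptions introduced here, not part of the original walkthrough.</p>

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Count true/false positives and negatives after thresholding the scores."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1:
            counts["tp" if label == 1 else "fp"] += 1
        else:
            counts["fn" if label == 1 else "tn"] += 1
    return counts


# Toy labels and scores invented for this example.
toy_labels = [0, 0, 1, 1, 0]
toy_scores = [0.1, 0.7, 0.8, 0.3, 0.2]

counts = confusion_counts(toy_labels, toy_scores)
accuracy = (counts["tp"] + counts["tn"]) / len(toy_labels)
print(counts, accuracy)
```

<p>In the notebook, the same helper could be called with the y_yes column of test_df as the labels and predictions_array as the scores. When you are done experimenting, remember that a deployed endpoint keeps running (and billing) until you delete it.</p>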

<h2 id="conclusion">Conclusion</h2>

<p>Cloud computing and ML technologies are transforming the world we live in. Amazon SageMaker minimizes the need for maintenance and streamlines the machine learning pipelines while cutting costs. It’s a valuable tool for building end-to-end ML solutions at scale. I hope this article helps you get started.</p>]]></content><author><name>samuel</name></author><category term="MachineLearning" /><category term="SageMaker" /><category term="aws" /><summary type="html"><![CDATA[We’re in the fourth industrial age, where data is accumulated, collated, analyzed, and interpreted at lightning-fast speeds. Highly competitive organizations enhance their platforms with AI, but many companies are unable to process the information they collect and use it in a meaningful way because they have insufficient computing resources, storage, and even availability. That’s where cloud-native AI and machine learning (ML) like Amazon SageMaker come into play. In this post, you’ll learn about Amazon SageMaker—what it is and how to use it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://res.cloudinary.com/samueljames/image/upload/v1613297243/pasted-image-0.png" /><media:content medium="image" url="https://res.cloudinary.com/samueljames/image/upload/v1613297243/pasted-image-0.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">What People Don’t Understand About DevOps</title><link href="https://hubofco.de/devops/2021/01/26/what-people-get-wrong-about-devops/" rel="alternate" type="text/html" title="What People Don’t Understand About DevOps" /><published>2021-01-26T19:35:00+00:00</published><updated>2021-01-26T19:35:00+00:00</updated><id>https://hubofco.de/devops/2021/01/26/what-people-get-wrong-about-devops</id><content type="html" xml:base="https://hubofco.de/devops/2021/01/26/what-people-get-wrong-about-devops/"><![CDATA[<p>I was scrolling carelessly on Reddit a few days ago when I saw a thread on buzzwords that made me 
pause for a few seconds. In truth, buzzwords are powerful. They compress ideas and concepts into a single word, and the word often becomes more popular than the idea it represents. I like to think of buzzwords as a way of packaging ideas in fancy words.</p>

<p>This post, though, is not about buzzwords in general but about one word that many people misunderstand: DevOps.</p>

<p>I get emails from recruiters and companies looking for DevOps engineers all the time, which makes me wonder how differently people understand DevOps. Not to mention the countless posts that have been written about which technologies to master to become a DevOps guru. If you’re looking to transition from backend engineer to DevOps engineer, or vice versa, the internet has an answer for you. Today, a Google search for “DevOps Engineer position” returns no fewer than 7 million results. Companies are waiting for that candidate to be their DevOps god.</p>

<p>At one point in my career, I thought DevOps was just another tech role. I remember sitting in an interview five years ago when the engineer on the other side of the table popped the question: “What do you understand by DevOps?” Guess what? I got it all wrong. I thought it was a role. Even now, DevOps is a word not many people fully understand.</p>

<p>You might ask, what is DevOps? To understand what DevOps is, we first need to understand what DevOps is not.</p>

<h2 id="its-not-a-role-or-a-title">It’s not a role or a title.</h2>

<p>DevOps is not a role you assume after graduating with a CS degree from a top university. Neither is it a role you’re entitled to after building test automation infrastructure for a company’s product. DevOps is much more than a title conferred on one person.</p>

<h2 id="its-not-about-tools">It’s not about tools.</h2>

<p>While many tools can support your DevOps initiatives, DevOps is not about tools. You don’t practice DevOps by installing or learning tools. DevOps is broader than the tools you use. If your organization does not fully grasp the main idea behind DevOps, you will have a hard time benefiting from everything DevOps has to offer.</p>

<h2 id="its-not-all-about-bringing-devs-and-ops-together-in-one-team">It’s not all about bringing Devs and Ops together in one team.</h2>

<p>DevOps is much more than combining Devs and Ops in the same team. While putting Devs and Ops in the same team to improve collaboration is one of the DevOps practices, it’s easy to fall into the trap of thinking you’re practicing DevOps merely because you brought Devs and Ops together in one team.</p>

<h2 id="its-not-a-team">It’s not a team</h2>

<p>It’s common to see businesses that want to adopt a DevOps mindset assume they practice DevOps because they created a team tagged “DevOps team”, mainly composed of Ops engineers in charge of operations-related tasks. Having a DevOps team does not mean you have adopted DevOps.</p>

<blockquote>
  <p>DevOps is a culture, a way of doing things. It transcends a single team.</p>
</blockquote>

<p>It must be embraced by the entire organization. You may have heroes who understand DevOps and have shown the willingness and the leadership skills to help drive a DevOps transformation across your organization. However, a dedicated team of Ops engineers is not the same thing as practicing DevOps.</p>

<h1 id="what-does-devops-mean">What does DevOps mean?</h1>

<p>The word DevOps came to be in 2009 after Patrick Debois coined it. It’s a word formed by combining “development” and “operations.” To understand the reasoning behind the word and how it came to be, you first need to understand the problem it tries to solve.</p>

<p>Historically, there has been friction between development teams and operations teams, which creates a myriad of problems such as longer time to market, frequent outages, reduced quality, and steadily mounting technical debt. This friction is the result of two opposing goals that the teams must pursue simultaneously.</p>

<p>The development teams are tasked with shipping more features, making frequent changes, and responding to shifting market demands. On the other hand, IT Operations teams are tasked with keeping services reliable, stable, and secure.</p>

<p>When you make frequent changes, the chances of disrupting service stability or breaking something are high. Since the IT Ops team has a goal of service reliability and stability, it often responds by putting measures in place that make it difficult to disrupt production services. In my personal experience, these measures often include bureaucratic approval policies, outright rejection, or long waiting times, and they hinder the development team from responding quickly to changing markets.</p>

<p>In the real world, the apps that generate the most revenue are the ones that need the most frequent changes and releases and also need to be the most reliable and stable. These opposing goals make developer-operations conflict unavoidable, so something had to be done. Hence, the word DevOps was born.</p>

<h1 id="what-does-it-mean-to-practice-devops">What does it mean to practice DevOps?</h1>

<p>DevOps is about people as well as culture, with one goal: continuously delivering business value.</p>

<blockquote>
  <p><em>It’s a set of practices and patterns that turn human capital into a high-performance organization capital.</em></p>
</blockquote>

<p>It’s a way of working. It’s about building practices around Culture, Automation, Lean, Measurement, and Sharing. It hinges on a collaborative working relationship between IT Ops and Developers.</p>

<p><img src="https://cdn-images-1.medium.com/max/1600/0*uMTk7yzL1vvVR-XL.png" alt="img" />Image from https://itnext.io/do-not-put-devops-in-a-cage-3604a83821e1</p>

<p>Rather than having developers and IT Ops in separate teams, they are brought together in the same team for a fast feedback loop. The cross-functional team implements feature validation and ensures quality. It’s like my all-time favorite quote from the CTO of Amazon: you build it; you run it.</p>

<blockquote>
  <p>The cross-functional team builds it; they are responsible for running it.</p>
</blockquote>

<p>The team that practices DevOps works in a production-like environment where features are continuously rolled out to production. Instead of one big-bang deployment that happens once a month or once a year, they deploy several times a day.</p>

<p>Rather than working on a huge feature that takes months to deliver, features are broken into smaller deliverable chunks that provide business value. Short-lived PRs are opened and merged quickly once the feature is done, creating a fast feedback loop. With the fast feedback loop, they can see how their changes and actions negatively or positively impact the business outcome. This team has efficient and fast automated tests that run whenever changes are committed into version control, ensuring stability and security.</p>

<p>Rather than a culture that frowns at mistakes, mistakes are embraced and become a valuable learning tool. The team members are encouraged to take risks.</p>

<blockquote>
  <p>Rather than waiting for someone to tell them what to do, they take initiative and innovate.</p>
</blockquote>

<p>Instead of finger-pointing when a problem occurs, they openly discuss it and learn from it. The organization builds a culture where people are rewarded for taking risks. With the fear of making mistakes gone, they frequently experiment, thus fostering more innovation.</p>

<p>Some features are in production for weeks yet remain invisible to end users. They are turned on only for internal users or a selected group of users, allowing teams to test and verify that they meet business outcomes before turning them on for all users.</p>
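<p>A minimal sketch of such a feature toggle, in Python. The function name, the flag name, and the percentage-based rollout are illustrative assumptions of mine, not taken from any specific tool:</p>

```python
import hashlib


def is_enabled(flag, user_id, internal_users, rollout_percent=0):
    """Decide whether `flag` is switched on for `user_id`.

    Internal users always see the feature. Everyone else is placed in a
    stable bucket from 0 to 99, so the same user gets the same answer on
    every request while the rollout percentage is raised step by step.
    """
    if user_id in internal_users:
        return True
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return rollout_percent > bucket


# Hypothetical usage: the feature ships dark and is visible to staff only.
staff = {"alice"}
print(is_enabled("new-checkout", "alice", staff))  # staff member
print(is_enabled("new-checkout", "bob", staff))    # regular user, 0% rollout
```

<p>Hashing the flag name together with the user id keeps each feature’s rollout independent of the others.</p>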

<blockquote>
  <p>They conduct several A/B tests to learn user behavior and understand how each feature directly affects the business outcome.</p>
</blockquote>

<p>Because the organization hired smart minds, micro-management is non-existent. Each team is autonomous and makes decisions in the business’s interest. Instead of assigning tasks to teams, problems and goals are given. The business trusts the smart minds it hired to develop the best solution to solve the problem.</p>

<p>Because the business cares about achieving business goals, they create long-term teams around each goal responsible for meeting them. Developers are no longer reshuffled or re-assigned to new projects. Each team takes full responsibility and works independently to achieve its assigned goal.</p>

<p>Rather than thinking in terms of output, they think in terms of outcome. They don’t celebrate the number of tasks completed but rather the business outcomes achieved. Because the team cares about quality, resilience, and reliability, they deliberately inject failures into their production environment to uncover weaknesses, learn how their systems fail, and remediate them.</p>

<h1 id="wrapping-up">Wrapping up</h1>

<p>We could go on and on about what DevOps is; the bottom line is DevOps is not a title, a role, or a job function, and you can’t hire it. It’s more of a culture, and your entire organization needs to embrace DevOps for it to work. As Irma Harlann beautifully sums it up in his <a href="https://neonrocket.medium.com/devops-is-a-culture-not-a-role-be1bed149b0#:~:text=Mike Dilworth%2C Agile and DevOps,DevOps for it to work.&amp;text=DevOps is about continual learning and improvement rather than an end state.">post</a>: “The whole company needs to be doing DevOps for it to work.”</p>]]></content><author><name>Samuel</name></author><category term="Devops" /><category term="Devops" /><category term="software" /><summary type="html"><![CDATA[I was scrolling carelessly on Reddit a few days ago when I saw a thread on buzzwords that made me pause for a few seconds. In truth, buzzwords are powerful. They tend to express ideas and concepts in one word that are usually more popular than the core ideas they represent. For me, I like to see buzzwords as a way of packaging ideas with fancy words.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/images/logo.png" /><media:content medium="image" url="https://hubofco.de/images/logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Sales Demand Forecasting with Amazon Forecast</title><link href="https://hubofco.de/aws/forecast/2020/10/17/sales-demand-forecasting-with-amazon-forecast/" rel="alternate" type="text/html" title="Sales Demand Forecasting with Amazon Forecast" /><published>2020-10-17T07:36:00+00:00</published><updated>2020-10-17T07:36:00+00:00</updated><id>https://hubofco.de/aws/forecast/2020/10/17/sales-demand-forecasting-with-amazon-forecast</id><content type="html" xml:base="https://hubofco.de/aws/forecast/2020/10/17/sales-demand-forecasting-with-amazon-forecast/"><![CDATA[<p><img src="https://hubofco.de/uploads/forecasting.png" alt="" />
According to <a href="https://en.wikipedia.org/wiki/Forecasting">Wikipedia</a>, Forecasting is a process of making predictions of the future based on past and present data and, most commonly, by analyzing trends.</p>

<p>History is filled with people trying to predict the future by looking at trends and patterns. We don’t have the power to change the past, but we can control future outcomes if we know the future.</p>

<p>For businesses, the ability to predict the future and make informed decisions is critical to their survival.</p>

<p>The traditional method of generating forecasts from time series data often struggles to generate accurate predictions, especially when dealing with extensive data with irregular trends.</p>

<p>The ability to predict demand accurately is a critical need for retailers. They need to know how many units of each product to keep on hand to be fully stocked at any given time.</p>

<p>A low inventory level increases the risk of a stockout, and an inventory level that is too high increases the cost of handling inventory.</p>

<p>Recent research shows that <a href="https://www.veeqo.com/inventory-management">43% of retailers</a> admitted that they have inventory problems. Retailers are under constant pressure to meet market demand. This pressure is exacerbated by the fact that consumers have many options. A customer with unmet needs is unlikely to return to the same store. It’s estimated that <a href="https://www.retaildive.com/news/online-out-of-stocks-cost-22-billion-in-sales/528878/">online out-of-stock problems cost retailers $22B in sales</a>.</p>

<p>The question is, can recent advances in AI help retailers forecast demand and keep customers happy at all times?</p>

<p>The rapid democratization of machine learning solutions (AutoML) is empowering individuals and businesses with little to no machine learning skills to solve complex problems that previously required strong expertise in AI. With AutoML, businesses can apply machine learning to complex problems without getting buried in the complexity associated with building, training, and serving ML models at scale.</p>

<p>One such offering is Amazon Forecast.</p>

<h1 id="amazon-forecast">Amazon Forecast</h1>

<p>Amazon Forecast is a managed service that uses ML to deliver highly accurate predictions. It looks at historical data (time series data) to build forecasts. Amazon Forecast abstracts the complexity of building, training, and deploying a machine learning model. It automatically examines your data, engineers features, and generates a forecasting model.</p>

<p>Amazon Forecast is based on the same technology used at Amazon.com. It has successful use cases from companies like <a href="https://www.moreretail.in/">More Retail</a>, <a href="https://www.anaplan.com/">Anaplan</a>, <a href="https://www.axiomtelecom.com/">AxiomTelecom</a>, <a href="https://www.omnys.com/">Omnys</a>, etc.</p>

<p>In this post, accompanied by a Jupyter Notebook, you’ll learn how to predict sales demand using Amazon Forecast. The workflow demonstrated in the notebook, from data preprocessing to training a predictor, can be fully automated.</p>

<h1 id="required-steps">Required Steps:</h1>

<ul>
  <li>Download Jupyter Notebook</li>
  <li>Pre-process the dataset and upload it to S3</li>
  <li>Import training data</li>
  <li>Create a predictor</li>
  <li>Create a forecast</li>
  <li>Retrieve forecast</li>
</ul>

<h1 id="download-jupyter-notebook">Download Jupyter Notebook</h1>

<p>To follow this tutorial, you’ll need to download my <a href="https://github.com/abiodunjames/Sales-demand-forecast/blob/master/Sales_demand_forecast.ipynb">Jupyter Notebook here</a>, which you can run on an Amazon SageMaker notebook instance. Please ensure that your notebook instance has an IAM role with the AmazonForecastFullAccess policy, the AmazonSageMakerFullAccess policy, and an S3 policy with read and write access to the S3 bucket that contains the preprocessed dataset.</p>

<h1 id="pre-processed-dataset">Pre-processed Dataset</h1>

<p>For this demo, I used the public <a href="https://www.kaggle.com/olistbr/brazilian-ecommerce">Olist e-commerce dataset</a> to train a predictor. The dataset can be found in this <a href="https://github.com/abiodunjames/Predicting-ecommerce-sales-forecast">repository</a>.</p>

<p>It’s important to note that you can’t just feed any dataset into Amazon Forecast. A dataset domain must be specified, and the dataset must conform to the structure of the domain. You can think of a dataset domain as a predefined dataset schema or format for a use case. For this example, the <a href="https://docs.aws.amazon.com/forecast/latest/dg/retail-domain.html">RETAIL Domain</a> is a perfect choice.</p>

<p>This means we have to do some data pre-processing. The <a href="https://docs.aws.amazon.com/forecast/latest/dg/retail-domain.html">RETAIL Domain</a> defines three required fields:</p>

<ul>
  <li><em>item_id</em>: (string) — A unique identifier for the item or product you want to predict the demand.</li>
  <li><em>timestamp</em>: (timestamp)</li>
  <li><em>demand</em>: (float) — The number of sales for that item at the timestamp. This is also the target field for which Amazon Forecast generates a forecast.</li>
</ul>

<p>The dataset has to be pre-processed to conform to our domain: RETAIL domain.</p>

<p><img src="https://miro.medium.com/max/3200/0*rnpkCdf3mce27fSX" alt="Image for post" /></p>

<p><img src="https://miro.medium.com/max/3200/0*hFc4k5Ou1y60i5DM" alt="Image for post" /></p>
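<p>As a rough text version of this step, here is a minimal sketch of reshaping raw order lines into the three RETAIL fields using only the Python standard library. The input shape (pairs of item and purchase time), the daily aggregation, and the header-less CSV are my assumptions for illustration; the notebook may do this differently.</p>

```python
import csv
import io
from collections import Counter
from datetime import datetime


def to_retail_rows(orders):
    """Aggregate raw order lines into daily demand per item.

    `orders` is an iterable of (item_id, purchase_timestamp) pairs, where
    the timestamp is a string such as "2018-08-01 10:15:00" (assumed format).
    """
    demand = Counter()
    for item_id, purchased_at in orders:
        day = datetime.strptime(purchased_at, "%Y-%m-%d %H:%M:%S").date()
        demand[(item_id, day.isoformat())] += 1
    # Sort by item and day so the output file is reproducible.
    return [(item, day, float(count)) for (item, day), count in sorted(demand.items())]


def to_csv(rows):
    """Serialise rows as CSV text in item_id, timestamp, demand order."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample: two pet_shop sales on one day, one furniture sale.
sample = [
    ("pet_shop", "2018-08-01 10:15:00"),
    ("pet_shop", "2018-08-01 18:00:00"),
    ("moveis_decoracao", "2018-08-02 09:00:00"),
]
print(to_csv(to_retail_rows(sample)))
```

<p>The column order matters because the schema you later register with Forecast has to list the attributes in the same order as the file.</p>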

<h1 id="import-training-dataset">Import Training Dataset</h1>

<p>To import the training dataset, we first need to create a Forecast dataset. This is where we specify information about the dataset so that Amazon Forecast understands how to consume it. We can use the Forecast <a href="https://docs.aws.amazon.com/forecast/latest/dg/API_CreateDataset.html"><em>CreateDataset</em></a> API to achieve this as follows:</p>

<p><img src="https://miro.medium.com/max/3200/0*aHTcP-PO-FpEHRlJ" alt="Image for post" /></p>
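<p>In text form, the request could look roughly like the sketch below. It only builds the keyword arguments for boto3’s <code>create_dataset</code> and <code>create_dataset_import_job</code> calls, so it runs without AWS credentials. The dataset name, the daily frequency, the S3 path, and the role ARN are placeholders I made up.</p>

```python
def dataset_request(name):
    """Keyword arguments for the Forecast CreateDataset call."""
    return {
        "DatasetName": name,
        "Domain": "RETAIL",
        "DatasetType": "TARGET_TIME_SERIES",
        "DataFrequency": "D",  # assumption: one row per item per day
        "Schema": {
            # Attribute order must match the column order of the CSV file.
            "Attributes": [
                {"AttributeName": "item_id", "AttributeType": "string"},
                {"AttributeName": "timestamp", "AttributeType": "timestamp"},
                {"AttributeName": "demand", "AttributeType": "float"},
            ]
        },
    }


def import_job_request(job_name, dataset_arn, s3_path, role_arn):
    """Keyword arguments for the Forecast CreateDatasetImportJob call."""
    return {
        "DatasetImportJobName": job_name,
        "DatasetArn": dataset_arn,
        "DataSource": {"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
        "TimestampFormat": "yyyy-MM-dd",  # matches daily data
    }


# With boto3 installed and credentials configured, usage would look like:
#   forecast = boto3.client("forecast")
#   arn = forecast.create_dataset(**dataset_request("sales_demand"))["DatasetArn"]
#   forecast.create_dataset_import_job(**import_job_request(
#       "sales_demand_import", arn, "s3://your-bucket/train.csv", "your-role-arn"))
print(dataset_request("sales_demand")["Schema"]["Attributes"])
```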

<p>Once the import job has succeeded, the next step is to create a predictor using the <a href="https://docs.aws.amazon.com/forecast/latest/dg/API_CreatePredictor.html"><em>CreatePredictor</em></a> API.</p>

<h1 id="create-a-predictor">Create a Predictor</h1>

<p>To create a predictor, we specify the predictor’s name, the dataset group, and some other parameters. We set PerformAutoML to true, which means Amazon Forecast will evaluate each algorithm and choose the one that minimizes the objective function.</p>

<p><img src="https://miro.medium.com/max/3200/0*cSsuPgSCFFoVNaOw" alt="Image for post" /></p>
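<p>A hedged sketch of the parameters described above, again as a plain dictionary that could be passed to boto3’s <code>create_predictor</code>. The predictor name, the ARN, and the 17-day horizon are placeholders of mine, not values from the notebook.</p>

```python
def predictor_request(name, dataset_group_arn, horizon_days):
    """Keyword arguments for the Forecast CreatePredictor call."""
    return {
        "PredictorName": name,
        # Number of future time steps to predict, in units of the frequency.
        "ForecastHorizon": horizon_days,
        # Let Forecast try its algorithms and keep the best-scoring one.
        "PerformAutoML": True,
        "InputDataConfig": {"DatasetGroupArn": dataset_group_arn},
        "FeaturizationConfig": {"ForecastFrequency": "D"},
    }


# With boto3 this would be sent as:
#   forecast.create_predictor(**predictor_request(...))["PredictorArn"]
print(predictor_request("sales_demand_predictor", "your-dataset-group-arn", 17))
```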

<p>Note: The predictor will not be available for use until it’s active. You can check for this status using the Forecast <a href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/forecast.html#ForecastService.Client.describe_predictor">DescribePredictor</a> API.</p>
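<p>One way to do that status check is a small polling loop. This is a sketch under my own assumptions: the helper name, the retry limits, and the injected <code>describe</code> callable are illustrative, and the loop simply waits for the <code>Status</code> field of the response to read <code>ACTIVE</code>.</p>

```python
import time


def wait_until_active(describe, max_attempts=60, delay_seconds=30, sleep=time.sleep):
    """Poll `describe()` until the returned status is ACTIVE.

    `describe` is any zero-argument callable returning a dict with a
    "Status" key, for example:
        lambda: forecast.describe_predictor(PredictorArn=predictor_arn)
    """
    for _ in range(max_attempts):
        status = describe()["Status"]
        if status == "ACTIVE":
            return status
        if status.endswith("FAILED"):
            raise RuntimeError(f"resource entered status {status}")
        sleep(delay_seconds)
    raise TimeoutError("resource did not become ACTIVE in time")


# Offline demonstration with a fake sequence of statuses and no real waiting.
fake_statuses = iter(["CREATE_PENDING", "CREATE_IN_PROGRESS", "ACTIVE"])
print(wait_until_active(lambda: {"Status": next(fake_statuses)}, sleep=lambda s: None))
```

<p>Passing the describe call in as a function keeps the loop reusable for import jobs and forecasts, which report their status the same way.</p>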

<h1 id="create-forecasts">Create Forecasts</h1>

<p>Once the predictor is active, we can create a forecast for each item used to train a predictor. In this example, I specified 3 quantiles (0.1, 0.5 and 0.9) per forecast.</p>

<p><img src="https://miro.medium.com/max/5816/1*MnF61ZIRdljxhGXKSS4K9g.png" alt="Image for post" /></p>
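<p>As a sketch, the three quantiles map onto the <code>ForecastTypes</code> parameter of boto3’s <code>create_forecast</code>, which takes them as strings. The forecast name and predictor ARN below are placeholders.</p>

```python
def forecast_request(name, predictor_arn, quantiles=(0.1, 0.5, 0.9)):
    """Keyword arguments for the Forecast CreateForecast call."""
    return {
        "ForecastName": name,
        "PredictorArn": predictor_arn,
        # The API expects quantiles as strings such as "0.1".
        "ForecastTypes": [str(q) for q in quantiles],
    }


# With boto3: forecast.create_forecast(**forecast_request(...))["ForecastArn"]
print(forecast_request("sales_demand_forecast", "your-predictor-arn"))
```

<p>A 0.9 quantile forecast is a value that actual demand is expected to stay at or below about 90% of the time, which makes it the cautious choice when running out of stock is costlier than overstocking.</p>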

<h1 id="retrieve-forecasts">Retrieve Forecasts</h1>

<p>You can either retrieve a forecast through the console or by using the <a href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/forecast.html#ForecastService.Client.create_forecast_export_job">CreateForecastExportJob</a> API to export the complete forecast into an S3 bucket.</p>
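<p>For the export route, a minimal sketch of the request is shown here. It builds the arguments for boto3’s <code>create_forecast_export_job</code>; the job name, bucket path, and role ARN are made-up placeholders.</p>

```python
def export_job_request(job_name, forecast_arn, s3_path, role_arn):
    """Keyword arguments for the Forecast CreateForecastExportJob call."""
    return {
        "ForecastExportJobName": job_name,
        "ForecastArn": forecast_arn,
        # The role must allow Forecast to write to the destination bucket.
        "Destination": {"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
    }


# With boto3: forecast.create_forecast_export_job(**export_job_request(...))
print(export_job_request(
    "sales_demand_export", "your-forecast-arn", "s3://your-bucket/exports/", "your-role-arn"
))
```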

<p><img src="https://miro.medium.com/max/3200/0*FtHBSEGgJmeACtdt" alt="Image for post" /></p>

<p>Demand forecast for product <code class="language-plaintext highlighter-rouge">moveis_decoracao</code> from September 1st to September 17th</p>

<p><img src="https://miro.medium.com/max/3200/0*QTukwVvF8Lnqn6H2" alt="Image for post" /></p>

<p>Demand forecast for product <code class="language-plaintext highlighter-rouge">pet_shop</code> from September 1st to September 17th</p>

<h1 id="conclusion">Conclusion</h1>

<p>We have seen one use case of Amazon Forecast in retail and how businesses can apply it to cut the costs of keeping inventory too high or too low. I hope this sparks new ideas as you embark on building your own solution to demand problems in retail.</p>

<p>There’s room for improvement, and if you’re looking to develop a fully automated, large-scale forecasting solution, I encourage you to look at the <a href="https://aws.amazon.com/blogs/machine-learning/building-ai-powered-forecasting-automation-with-amazon-forecast-by-applying-mlops/">Forecast process by applying MLOps</a>.</p>

<p><a href="https://github.com/abiodunjames/Sales-demand-forecast/blob/master/Sales_demand_forecast.ipynb"><strong>Jupyter Notebook</strong></a></p>]]></content><author><name>samuel</name></author><category term="AWS" /><category term="Forecast" /><category term="retail" /><category term="ecommerce" /><summary type="html"><![CDATA[According to Wikipedia, Forecasting is a process of making predictions of the future based on past and present data and, most commonly, by analyzing trends.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://hubofco.de/uploads/forecasting.png" /><media:content medium="image" url="https://hubofco.de/uploads/forecasting.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How We Built a Serverless E-Commerce Website on AWS to Fight COVID-19</title><link href="https://hubofco.de/serverless/2020/09/12/how-we-built-a-serverless-e-commerce-website-on-aws-to-combat-covid-19/" rel="alternate" type="text/html" title="How We Built a Serverless E-Commerce Website on AWS to Fight COVID-19" /><published>2020-09-12T06:52:00+00:00</published><updated>2020-09-12T06:52:00+00:00</updated><id>https://hubofco.de/serverless/2020/09/12/how-we-built-a-serverless-e-commerce-website-on-aws-to-combat-covid-19</id><content type="html" xml:base="https://hubofco.de/serverless/2020/09/12/how-we-built-a-serverless-e-commerce-website-on-aws-to-combat-covid-19/"><![CDATA[<p><img src="https://miro.medium.com/max/1962/0*WcEW2tF0Rf7k21Kr" alt="Image for post" /></p>

<p>2020 turned out to be radically different from what everyone had expected. COVID-19 has impacted our lives in many ways. As the pandemic spread, one question that baffled us was: what can we do with technology to combat the virus?</p>

<p>A few days later, we got the answer.</p>

<p>In a phone conversation in March 2020, <a href="https://medium.com/u/98b05140c1ad?source=post_page-----2b66155f9b08----------------------">Olalekan Elesin</a> told me about an idea he had come up with while doing his regular grocery shopping at <a href="https://www.dm.de/">DM Drogrie</a>. The idea centered on a bracelet, worn like a typical watch, that dispenses sanitizer. Until then, sanitizer came packaged in bottles and cans of different sizes and kinds. Some are mounted on walls, and some are portable enough to be carried around in handbags, but none were portable enough to be worn on the wrist.</p>

<p>Having spent years building scalable applications and data applications at various startups, we understood that every idea must be validated. We have to know it’s an idea that people want and are willing to pay for. As for us, this means doing some market research.</p>

<h1 id="problem-discovery-and-customer-validation"><strong>Problem Discovery and Customer Validation</strong></h1>

<p>We started out with one goal split into two hypotheses: this is a big enough problem, and customers are willing to part with cash in exchange for the product. Validating such assumptions with software or digital products is relatively straightforward. One could build a simple landing page, explain the idea, and add a form, the lean startup way. But for physical products, this is quite different. Developing an MVP could mean designing and 3D-printing hand bracelets.</p>

<p>After several iterations on the leanest and cheapest way to test, we arrived at one: a <a href="https://medium.com/@elesin.olalekan/maintaining-hand-hygiene-with-new-sanitizer-bracelets-d52bc0e8f647">blog post as MVP</a>.</p>

<p>In the blog post, we embedded a pre-order Google form to collect some information about potential customers. The form included color, price, payment method, and quantity fields.</p>

<blockquote>
  <p>While doing market research, it’s essential to collect potential customer contact details like email addresses, etc. If your product resonates with people, potential customers will not hesitate to give their contact details. Customers who provide contact details at the pre-launch phase are usually the first to be converted when you go live.</p>
</blockquote>

<p>Before our product hit the market, we recorded over <strong>$100,000</strong> in pre-order bookings from mature markets such as the USA, Germany, Italy, France, and the United Kingdom through the Medium post with zero ad budget. More than <strong>85%</strong> of customers were willing to pay online and have their <a href="http://sanitizerwristbands.com/">SanitizerWristbands</a> delivered to them. We validated our problem and customer hypotheses with these and other leading indicators: a big enough pain and customer willingness to pay. This informed our next decision to develop the first physical MVP.</p>

<h1 id="manufacturing-an-uncharted-territory"><strong>Manufacturing: An Uncharted Territory</strong></h1>

<p>Physical goods manufacturing was uncharted territory for us. It was not long before we realized it’s different from everyday app development. In manufacturing, you design and create a product and then replicate it repeatedly. In software, the product design is the product. This is not to say one cannot apply certain software development principles to manufacturing. The concept of lean, which is popular in tech, originated from the Toyota Production System.</p>

<p>With our first product design ready, we spoke to manufacturing companies to build the prototype. We learned quickly that our design was too cumbersome and would cost a substantial amount to produce.</p>

<p><img src="https://miro.medium.com/max/4768/1*dE_ooPtDYnmOdA9y42mYuA.png" alt="Image for post" /></p>

<p>Figure 1: An Earlier Iteration</p>

<p>Through customer feedback, and by trying to cut the production cost down to the amount customers were willing to pay, we arrived at the simplest possible design that works.</p>

<p>We were ready, and it was time to put our store online.</p>

<h1 id="scalable-ecommerce-website-with-0-commitment"><strong>Scalable eCommerce Website With $0 Commitment</strong></h1>

<p>There are many e-commerce platforms for online retailers that don’t require technical expertise. <a href="https://www.shopify.com/">Shopify</a> and <a href="https://squareup.com/us/en/ecommerce">Square</a> are popular choices. We wanted a platform that would give us fast access to the market with little to zero cost commitment. Shopify and other popular platforms didn’t meet our cost requirements, so we built a custom solution in 2 days with $0 commitment. It’s a static website hosted on S3 with serverless cart and inventory capabilities provided by <a href="https://app.snipcart.com/register?utm_source=hubofcode&amp;utm_medium=referral&amp;utm_campaign=aws-post">Snipcart</a>.</p>

<p>The high-level architecture looks like this:</p>

<p><img src="https://miro.medium.com/max/1446/0*cT0BYaOGzG7KTqOW" alt="Image for post" /></p>

<p>Figure 2: E-commerce Website Architecture</p>

<p>Right from the start, our goal was to automatically deploy changes made to our website whenever a PR is merged. We leveraged AWS CodeBuild and CodePipeline to achieve this and automatically deploy new changes to an S3 bucket.</p>

<p><img src="https://miro.medium.com/max/1962/0*61hii4IAT51buwJw" alt="Image for post" /></p>

<p>Figure 3: CI/CD with AWS</p>

<p>We sent a campaign email to customers who had pre-ordered before we launched.</p>

<h1 id="automating-orders-fulfilment"><strong>Automating Orders Fulfilment</strong></h1>

<p>With orders coming in, there was one more problem we had to solve: automating order fulfilment.</p>

<p>Our warehouse and logistics partners are in China, so we had to find a way of notifying them every time an order is placed. For a few orders per day, sending an email with an attachment to our logistics partners suffices. As the number of orders per day grew, this became tedious, and automating the process became a necessity. Could we automate it?</p>

<p>To automate fulfilment of orders, we created a serverless “order fulfilment addon” whose architecture looks like this:</p>

<p><img src="https://miro.medium.com/max/1962/0*WcEW2tF0Rf7k21Kr" alt="Image for post" /></p>

<p>Figure 4: Automating Order Fulfilment (WIP)</p>
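<p>To give a feel for what such an addon can look like, here is a hypothetical sketch of a Lambda-style handler that turns an incoming order webhook into a message for the logistics partner. It is not our actual implementation. The payload field names are assumptions loosely modelled on Snipcart’s <code>order.completed</code> webhook and should be checked against the Snipcart documentation.</p>

```python
import json


def handler(event, context=None):
    """Turn an order webhook into a fulfilment message (hypothetical shape)."""
    payload = json.loads(event["body"])
    # Ignore every event that is not a completed order.
    if payload.get("eventName") != "order.completed":
        return {"statusCode": 200, "body": json.dumps({"forwarded": False})}

    order = payload["content"]
    message = {
        "order": order["invoiceNumber"],
        "ship_to": order["shippingAddress"],
        "items": [
            {"name": item["name"], "quantity": item["quantity"]}
            for item in order["items"]
        ],
    }
    # A real handler would now send `message` to the logistics partner,
    # for example by email or through a queue. That part is left out here.
    return {"statusCode": 200, "body": json.dumps({"forwarded": True, "message": message})}


# Offline demonstration with a made-up order.
demo_event = {"body": json.dumps({
    "eventName": "order.completed",
    "content": {
        "invoiceNumber": "SW-1001",
        "shippingAddress": {"city": "Berlin", "country": "DE"},
        "items": [{"name": "Sanitizer wristband", "quantity": 2, "price": 9.99}],
    },
})}
print(handler(demo_event)["body"])
```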

<h1 id="making-informed-decisions-with-user-behaviour"><strong>Making Informed Decisions with User Behaviour</strong></h1>

<p>Our conversion goal was to get potential customers to buy our product. To reach this goal, we needed to learn what interrupts the user flow, from landing on the website to actually placing an order. For us, this meant conducting several A/B and multivariate tests weekly and implementing whatever got us closer to our goal.</p>

<hr />

<h1 id="lessons-learned"><strong>Lessons Learned</strong></h1>

<p>While we’re yet to achieve a massive scale like the popular e-commerce platforms, we’ve learned some valuable lessons that could spark new ideas for people looking to start online businesses.</p>

<ul>
  <li>When it comes to manufacturing, you’ll save a sizeable amount of money by manufacturing in developing countries (like India, etc.) rather than in developed countries like the US, Germany, etc.</li>
  <li>If you’re in Germany, registering a business can take a long time. If you plan to get to market fast, ensure you factor in the time it will take you to navigate the bureaucracy.</li>
  <li>Patent your ideas before doing market research. We started with market research. After we saw that the idea was valuable, we wanted to move on with patenting; unfortunately, it was too late. Someone had already filed a patent.</li>
  <li>China makes manufacturing look like a piece of cake.</li>
  <li>Data is king. “In God we trust, all others bring data” (W. Edwards Deming). It’s important to start with a data-driven mindset, no matter how small. Think of how to decompose big ideas into testable components and figure out the data required to validate along the way.</li>
  <li>Get your analytics right early! This is not referring to Google Analytics. Consider the metrics that are indicative of your success and instrument them from the get-go. A great resource: <a href="https://robsobers.com/9-marketing-stack-stepbystep-guide-archived/">$9 Marketing Stack: A Step-by-Step Guide</a>.</li>
  <li>Experiment like no one cares. We were always experimenting, with a minimum of 5 tests running weekly: A/B tests, multivariate tests, redirect tests, etc. You can do this for free with Google Optimize.</li>
</ul>]]></content><author><name>samuel</name></author><category term="Serverless" /><category term="aws" /><category term="serverless" /><category term="product" /><category term="snipcart" /><category term="ecommerce" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://miro.medium.com/max/1962/0*WcEW2tF0Rf7k21Kr" /><media:content medium="image" url="https://miro.medium.com/max/1962/0*WcEW2tF0Rf7k21Kr" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How I Built an Intelligent Twitter Bot</title><link href="https://hubofco.de/machinelearning/nlp/2020/05/26/how-i-built-an-intelligent-twitter-bot/" rel="alternate" type="text/html" title="How I Built an Intelligent Twitter Bot" /><published>2020-05-26T06:46:00+00:00</published><updated>2020-05-26T06:46:00+00:00</updated><id>https://hubofco.de/machinelearning/nlp/2020/05/26/how-i-built-an-intelligent-twitter-bot</id><content type="html" xml:base="https://hubofco.de/machinelearning/nlp/2020/05/26/how-i-built-an-intelligent-twitter-bot/"><![CDATA[<p>Twitter users tweet <a href="https://www.omnicoreagency.com/twitter-statistics/">500 million tweets</a> per day. The volume of information going through Twitter per day makes it one of the best platforms to get information on any subject of interest. In this post, I’ll walk you through how I built a Twitter bot with a brain.</p>

<p>According to <a href="https://www.cnet.com/news/new-study-says-almost-15-percent-of-twitter-accounts-are-bots/">a report</a>, there are 48 million bots on Twitter, and <a href="https://twitter.com/hubofml">hubofml</a> happens to be one of them. <a href="https://twitter.com/hubofml">Hubofml</a> started as a simple bot written in Node.js (on one Sunday evening) running on a free Heroku dyno. It tracks specific hashtags like <em>machinelearning</em> and <em>computervision</em> and retweets tweets containing those hashtags.</p>

<p>My goal was to use the bot to collate information on machine learning and re-broadcast to people interested in them — after all, some of the best posts I read on machine learning came from links shared on Twitter by the community.</p>

<p>I thought it would be cool to have a bot that tracks hashtags related to machine learning that I follow to stay informed.</p>

<p>Days after the bot was deployed, I started noticing some forms of abuse and spam like this:</p>

<p><img src="https://res.cloudinary.com/samueljames/image/upload/v1590478162/Screenshot_2020-05-26_at_09.28.26.png" alt="" /></p>

<p>People would write tweets unrelated to machine learning and hashtag “machinelearning,” and the bot would retweet it.</p>
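<p>To see why this happens, here is a simplified Python sketch of the kind of rule the first version of the bot applied (the original was written in Node.js, and the names below are illustrative). The rule only checks whether a tracked hashtag is present and knows nothing about the content of the tweet.</p>

```python
import re

# Hashtags the bot follows, lower-cased and without the leading "#".
TRACKED = {"machinelearning", "computervision"}


def hashtags(text):
    """Return the set of hashtags in a tweet, lower-cased."""
    return {tag.lower() for tag in re.findall(r"#(\w+)", text)}


def should_retweet(text, tracked=TRACKED):
    """Naive rule: retweet anything that carries a tracked hashtag."""
    return not hashtags(text).isdisjoint(tracked)


print(should_retweet("New survey paper on object detection #ComputerVision"))
# A tweet that has nothing to do with ML still passes the check.
print(should_retweet("Vote for my candidate today! #machinelearning"))
```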

<p>When we think of spam, it’s easy to picture only the unsolicited emails one receives from unknown senders.</p>

<p>As odd as it may seem, spam is not limited to emails alone these days. Spammers now target everything you can think of. From your inbox to comments on social media posts, you’ll find spam lurking around.</p>

<blockquote>
  <p>Personally, I think spam is destructive to communication and undermines the goal of social media.</p>
</blockquote>

<p>For the rest of the post, I will focus on how I used text categorization to combat spam on Twitter.</p>

<p>Text categorization is the process of automatically assigning one or more predefined categories to a text document. It has a wide range of use cases, such as article categorization, spam detection, and intent detection.</p>

<p>If you would like to look at the Jupyter notebook I used for this task as you read along, you will find it on Google Colab <a href="https://colab.research.google.com/drive/1cNGoYn-jk3y2hAz8JcZvXtvAtBu-6sgJ?usp=sharing">here</a> or in this <a href="https://github.com/abiodunjames/building-intelligent-twitter-bot-post">repository</a>.</p>

<h1 id="getting-twitter-datasets">Getting Twitter Datasets</h1>

<p>There are four ways to obtain Twitter public data. Justin Littman wrote an impressive <a href="https://gwu-libraries.github.io/sfm-ui/posts/2017-09-14-twitter-data">article</a> on the four approaches.</p>

<p>For this task, I used the free (<a href="https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets">Twitter Standard</a>) APIs. I downloaded about 110k tweets from Twitter using a Python library called Tweepy.</p>

<p>I annotated the datasets using two labels: “yes” and “no.” All spam tweets were labeled “yes” and non-spam tweets were labeled “no”.</p>

<p>Spam tweets (in this context) are tweets with no connection to the field of machine learning. They are usually about politics, crime, religion, trolling, etc. I retrieved spam tweets by using randomly selected keywords from WordNet as hashtags.</p>

<p>Non-spam (ham) tweets are tweets related to machine learning, computer vision, NLP, etc. I collected non-spam tweets by tracking hashtags like <em>machinelearning</em>, <em>computervision</em>, and <em>NLP</em> on Twitter using the Tweepy library.</p>

<h1 id="data-exploration">Data Exploration</h1>

<p>After downloading the datasets, my next step was to get a feel for the data. The datasets contained 99,989 tweets labeled as spam and 10,538 tweets labeled as non-spam.</p>

<p><img src="https://miro.medium.com/max/1692/1*jLl0zqCvD_zZ3MHocuQcVg.png" alt="img" /></p>

<p>Spam vs Non-spam</p>

<p>The distribution of the two labels shows that the tweet datasets are highly imbalanced. Imbalanced datasets often have a substantial effect on how machine learning models generalize. The model could learn that spam tweets are predominant and lean toward that class when it classifies unseen tweets.</p>

<p>I downsampled the majority class to arrive at an equal number of spam and non-spam (ham) tweets, 10,000 tweets in each class, accounting for a total of 20k tweets in the downsampled dataset.</p>

<p><img src="https://miro.medium.com/max/1436/1*LQpfkjPhxcXmwAre3xPHrg.png" alt="img" /></p>

<p>Spam vs Non-spam</p>
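
<p>The downsampling step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook’s actual code; the function name, the toy rows, and the fixed seed are assumptions made for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random

def downsample(rows, per_class, seed=42):
    """Keep at most `per_class` randomly chosen rows for each label."""
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for items in by_label.values():
        k = min(per_class, len(items))
        balanced.extend(rng.sample(items, k))
    rng.shuffle(balanced)  # avoid long runs of a single class
    return balanced

# Toy data (hypothetical): 8 spam rows labeled "yes" and 3 non-spam rows labeled "no".
rows = [("spam %d" % i, "yes") for i in range(8)]
rows += [("ham %d" % i, "no") for i in range(3)]

balanced = downsample(rows, per_class=3)
labels = [label for _, label in balanced]
print(labels.count("yes"), labels.count("no"))  # 3 3
</code></pre></div></div>

<p>With the real data, a per-class size of 10,000 would give the 20k balanced dataset described above.</p>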

<p>To get further insight into the data, I plotted the top-10 most common words in the two classes. For the non-spam tweets, <em>ai, machinelearning, artificialintelligence, data, data science, biodata, python,</em> and <em>deep learning</em> were found to be most common, as shown in the figure below.</p>

<p><img src="https://miro.medium.com/max/1816/1*LybgoXg-lT5TDlwHr5xZ7w.png" alt="img" /></p>

<p>On the other hand, words like <em>good, lol, like, thanks, day, know,</em> and <em>sorry</em> were found to be most common in spam tweets.</p>

<p><img src="https://miro.medium.com/max/2060/1*QAmrM8qwEVQHL7yAV9DP5A.png" alt="img" /></p>

<h1 id="data-preprocessing">Data Preprocessing</h1>

<p>Like other text documents, tweets are noisy, largely because there is no formal way of writing them. Tweets contain special elements such as usernames, retweet markers (“RT”), links, emoticons, and all sorts of unexpected things.</p>

<p>It’s essential to clean them up before fitting a model. I applied several preprocessing steps: lowercase conversion, URL removal, username removal, emoticon removal, tokenization, and stemming.</p>

<p>Tweets also contain common words that add little to the meaning of the text. These are known as stop words: a set of very frequently used words in a language. Removing stop words was crucial so that the model could focus on the important words in each tweet.</p>
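
<p>Here is a minimal sketch of what such a cleaning function might look like, using only the standard library. The regular expressions, the tiny stop-word list, and the function name are assumptions for illustration, and stemming is left out because it needs a third-party stemmer.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "is", "of", "on", "the", "this", "to"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9]+")

def preprocess(tweet):
    """Lowercase a tweet, strip noise, tokenize, and drop stop words."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)      # remove links
    text = MENTION_RE.sub(" ", text)  # remove @usernames
    # Keeping only letters and digits also drops emoticons and the '#' sign.
    tokens = TOKEN_RE.findall(text)
    # Drop the retweet marker and stop words.
    return [t for t in tokens if t != "rt" and t not in STOP_WORDS]

sample = "RT @hubofml: This is a GREAT intro to #MachineLearning :) https://example.com/post"
print(preprocess(sample))  # ['great', 'intro', 'machinelearning']
</code></pre></div></div>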

<h1 id="feature-selections">Feature Selections</h1>

<h2 id="vocabulary-list">Vocabulary list</h2>

<p>A vocabulary list is a dictionary that maps each word in the dataset to the number of times it occurs.</p>

<p>A vocabulary list could be:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{"machinelearn": 100, "Trump": 10}
</code></pre></div></div>

<p>I built a vocabulary list of all words, excluding stop words and words with fewer than two occurrences in the datasets.</p>
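
<p>A vocabulary like this can be built with a counter over the tokenized tweets. The sketch below is an illustration under assumptions: the function name, the toy tweets, and the stop-word set are made up for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from collections import Counter

def build_vocabulary(tokenized_tweets, stop_words, min_count=2):
    """Count every non-stop-word token and keep those seen at least `min_count` times."""
    counts = Counter(
        token
        for tokens in tokenized_tweets
        for token in tokens
        if token not in stop_words
    )
    return {word: n for word, n in counts.items() if n >= min_count}

# Toy tokenized tweets (hypothetical).
tweets = [
    ["machinelearn", "is", "fun"],
    ["machinelearn", "python"],
    ["python", "is", "great"],
]
vocab = build_vocabulary(tweets, stop_words={"is"})
print(vocab)  # {'machinelearn': 2, 'python': 2}
</code></pre></div></div>

<p>Here “fun” and “great” are dropped because they occur only once, and “is” is dropped as a stop word.</p>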

<h1 id="tweet-vectorization--padding">Tweet Vectorization &amp; Padding</h1>

<p>Machine learning models take vectors as input. In order to perform machine learning on text documents, we need to transform text documents into vector representations. This is known as text vectorization.</p>

<p>In my approach, I assigned a unique number to each word in the vocabulary. Each tweet is encoded using the numbers assigned to its words. If a word cannot be found in the vocabulary, it is automatically assigned a reserved index for unknown words (the <em>unk</em> entry, 0 in the example below).</p>

<p>Given the following vocabulary list:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "The": 5, "cat":1, "sat":3, "on":4, "mat":2, "in":6, "unk":0}
</code></pre></div></div>

<p>A sentence like “The cat sat on the mat in the morning” will be encoded as [5, 1, 3, 4, 5, 2, 6, 5, 0]. Both “The” and “the” map to 5, and “morning”, which is not in the vocabulary, maps to 0.</p>

<p>Finally, I padded the encoded sequences so that every tweet has the same length. This is what allows the network to take in a batch of tweets that originally had different lengths.</p>
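
<p>Encoding and padding can be sketched together as shown below. The vocabulary is the one from the example above with lowercase keys; the padding index of 7 and the choice to pad on the right are assumptions made only for this illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>VOCAB = {"the": 5, "cat": 1, "sat": 3, "on": 4, "mat": 2, "in": 6, "unk": 0}
PAD = 7  # hypothetical padding index, picked only because 0 to 6 are taken

def encode(sentence, vocab):
    """Map each lowercased word to its index, falling back to the 'unk' index."""
    return [vocab.get(word, vocab["unk"]) for word in sentence.lower().split()]

def pad(sequences, pad_value):
    """Right-pad every sequence to the length of the longest one."""
    longest = max(len(seq) for seq in sequences)
    return [seq + [pad_value] * (longest - len(seq)) for seq in sequences]

encoded = [
    encode("The cat sat on the mat in the morning", VOCAB),
    encode("The cat sat", VOCAB),
]
batch = pad(encoded, PAD)
for row in batch:
    print(row)
# [5, 1, 3, 4, 5, 2, 6, 5, 0]
# [5, 1, 3, 7, 7, 7, 7, 7, 7]
</code></pre></div></div>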

<h1 id="dataset-splitting">Dataset Splitting</h1>

<p>I split the dataset into a training set and a test set. The training set is used for learning (fitting the model), and the test set is used to check how the model generalizes to unseen data. 60% of the dataset was used for training, and 40% was used for testing.</p>
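
<p>A 60/40 split like this can be done by shuffling the rows and cutting the list. The sketch below uses only the standard library; the function name, the seed, and the toy rows are assumptions for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random

def split_dataset(rows, train_fraction=0.6, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy rows (hypothetical): ten labeled tweets.
rows = [("tweet %d" % i, "yes" if i % 2 else "no") for i in range(10)]
train, test = split_dataset(rows)
print(len(train), len(test))  # 6 4
</code></pre></div></div>

<p>Shuffling before cutting matters here: if the rows were still grouped by label, one class could end up almost entirely in the training set.</p>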

<h1 id="model-architecture">Model Architecture</h1>

<p>In the past few years, there have been many groundbreaking successes from applying RNNs to a variety of problems: speech recognition, language modeling, translation, image captioning, etc.</p>

<p>RNNs are great, but they suffer from short-term memory. If a tweet is too long, an RNN might have a hard time carrying information from earlier time steps to the current time step.</p>

<p>I used an LSTM network, a variant of the RNN. LSTMs mitigate the long-term dependency problem by being able to remember information for long periods. They have internal mechanisms called gates that regulate the flow of information into and out of the cell state. There’s an interesting <a href="https://colah.github.io/posts/2015-08-Understanding-LSTMs/">article</a> on LSTMs if you want to know more about how they work.</p>

<p>The network architecture consists of three layers, and it’s bidirectional. A bidirectional network allows inputs to be processed from the first to the last and from the last to the first. This ensures that the network is able to preserve information from both the past and future.</p>

<p>The first layer is the embedding layer, which transforms the input vectors into dense embedding vectors.</p>

<p>The second layer is the hidden layer, which takes in the dense vector and the previous hidden state to calculate the next hidden state. The final layer takes the final hidden states and feeds them through a fully connected layer, transforming them to the correct output dimension.</p>

<p>You can find more on the architecture in the Jupyter <a href="https://colab.research.google.com/drive/1cNGoYn-jk3y2hAz8JcZvXtvAtBu-6sgJ?usp=sharing">notebook</a>.</p>

<h1 id="deployment--inferencing">Deployment &amp; Inferencing</h1>

<p>After training the model, I deployed the model artifacts using <a href="https://flask.palletsprojects.com/en/1.1.x/">Flask</a> to a <a href="https://devcenter.heroku.com/articles/free-dyno-hours">free Heroku dyno</a> for real-time inference.</p>

<p>From tracking tweets to retweeting or liking, the communication process looks like this:</p>

<p><img src="https://miro.medium.com/max/1542/1*FIFe-ZZ4BwJyENwS3-ZcWg.png" alt="img" /></p>

<ul>
  <li>The bot finds a new tweet on machine learning.</li>
  <li>The bot calls the Heroku endpoint to predict the class of the tweet.</li>
  <li>Non-spam tweets are retweeted or liked, while spam tweets are ignored.</li>
</ul>
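
<p>The decision the bot makes for each tweet can be sketched as a small function. This is an illustration only: the function names are hypothetical, and the predictor passed in here is a stand-in for the HTTP call to the hosted model, not the real classifier.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def handle_tweet(text, predict_is_spam):
    """Return the action the bot takes for one tracked tweet."""
    if predict_is_spam(text):
        return "ignore"
    return "retweet"

def fake_predict_is_spam(text):
    """Stand-in for the request to the model endpoint; purely illustrative."""
    return "machinelearning" not in text.lower()

print(handle_tweet("A new #MachineLearning tutorial", fake_predict_is_spam))  # retweet
print(handle_tweet("Good morning everyone, lol", fake_predict_is_spam))       # ignore
</code></pre></div></div>

<p>Passing the predictor in as an argument keeps the bot logic separate from the model service, so either side can change without touching the other.</p>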

<h1 id="conclusion">Conclusion</h1>

<p>While the model performed beyond my expectations, it is still far from perfect. Occasionally, low-quality tweets manage to slip through the spam filter. I believe this could be improved by using higher-quality datasets.</p>

<p>The dataset I used for this task contained tens of thousands of tweets, and some of them were mislabeled. Label quality could be improved by crowdsourcing the labeling task through a service like <a href="https://www.mturk.com/">Amazon Mechanical Turk</a>.</p>

<p>This is my first attempt at combating spam on social media, and I do hope to take this work further in the future.</p>

<blockquote>
  <p>Every month, I send out a newsletter containing lots of exciting stuff on data science, software engineering, and machine learning. Expect quick tips, links to interesting tutorials, opinions, and libraries. <a href="https://hubofml.substack.com/subscribe">Subscribe here</a></p>
</blockquote>]]></content><author><name>samuel</name></author><category term="MachineLearning" /><category term="nlp" /><category term="twitter" /><category term="socialmedia" /><category term="textcategorization" /><category term="machinelearning" /><summary type="html"><![CDATA[The volume of information going through Twitter per day makes it one of the best platforms to get information on any subject of interest. In this post, I’ll walk you through how I built a twitter bot with a brain — powered by machine learning]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://miro.medium.com/max/1400/1*9s1wWE8H1jSfn-S6n2C49w.png" /><media:content medium="image" url="https://miro.medium.com/max/1400/1*9s1wWE8H1jSfn-S6n2C49w.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>