
Aaron Parecki

  • Eddie Hinkle http://eddiehinkle.com/   •   Mar 30

    Viewing webmentions

    🎉
    Portland, Oregon, USA
    Thu, Mar 30, 2017 7:53am -07:00
  • Hello from Homebrew Website Club PDX! Thanks to @DreamHost for hosting us! 🍕🎉 #indieweb
    Portland, Oregon, USA
    11 likes 2 reposts 2 mentions
    #indieweb #hwc
    Wed, Mar 1, 2017 7:00pm -08:00
  • Aaron Parecki https://aaronparecki.com/   •   Feb 27
    I'm excited that @manton2 of http://micro.blog will be in town for Homebrew Website Club! You should come! 🍕 🎉 https://indieweb.org/events/2017-03-01-homebrew-website-club
    We have another out of town guest for Homebrew Website Club tonight! 🎉 https://twitter.com/cleverdevil/status/837019191546941445 (we also have 🍕 so you should come!)
    Portland, Oregon, USA
    #indieweb
    Wed, Mar 1, 2017 2:55pm -08:00
  • I'm excited that @manton2 of http://micro.blog will be in town for Homebrew Website Club! You should come! 🍕 🎉 https://indieweb.org/events/2017-03-01-homebrew-website-club
    Portland, Oregon, USA
    2 likes 1 repost 1 reply
    Mon, Feb 27, 2017 2:21pm -08:00
  • Day 60: Emoji Detector Library for PHP #100DaysOfIndieWeb

    Sat, Feb 18, 2017 2:38pm -08:00

    I wanted to find all emoji in a string, including info about them, for my next #100Days project. However, I couldn't find a library that does this. The closest I found were iamcal's Emoji conversion library, which can replace emoji in a string with HTML tags, and the EmojiOne library, which can replace emoji in a string with shortcodes.

    I started down a path of attempting to understand Unicode encoding. A very helpful resource is this post from 2003 titled "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)". It's worth a read if you have to deal with user input at all.

    If you aren't familiar with the details of emoji, Unicode and UTF-8 encoding, you may not realize that an emoji such as 👨‍👩‍👦‍👦 is actually composed of seven Unicode code points. Each person is a separate character, and they are all connected with the "Zero Width Joiner" (ZWJ) character: 👨 [ZWJ] 👩 [ZWJ] 👦 [ZWJ] 👦. There are also skin tone modifiers, which are characters of their own. So an emoji like 👍🏼 is actually two characters: the 👍 plus the skin-tone-3 modifier.
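
    To see those code points for yourself, here is a minimal sketch in plain PHP (7.0 or later, for the "\u{...}" string escapes). The helper names utf8_to_code_point and code_points_hex are my own for illustration and are not part of any library mentioned here.

```php
<?php
// Decode one UTF-8 encoded character into its Unicode code point.
function utf8_to_code_point(string $char): int {
    $b = array_values(unpack('C*', $char));
    switch (count($b)) {
        case 1:  return $b[0];
        case 2:  return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
        case 3:  return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
        default: return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                      | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
    }
}

// Return the code points of a UTF-8 string as uppercase hex strings.
function code_points_hex(string $s): array {
    // The "u" flag makes PCRE split between code points instead of bytes.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    $hex = [];
    foreach ($chars as $c) {
        $hex[] = strtoupper(dechex(utf8_to_code_point($c)));
    }
    return $hex;
}

$family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}\u{200D}\u{1F466}";
$thumbs = "\u{1F44D}\u{1F3FC}";

echo implode(' ', code_points_hex($family)), "\n"; // 1F468 200D 1F469 200D 1F466 200D 1F466
echo implode(' ', code_points_hex($thumbs)), "\n"; // 1F44D 1F3FC
```

    The 200D entries are the ZWJ characters, and 1F3FC is the skin-tone-3 modifier.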

    To further complicate things, I've been talking about Unicode code points, but these code points can be represented as different byte sequences depending on the string encoding. Typically we only need to worry about handling UTF-8 encoded strings now, so that's where I started. The UTF-8 encoding of a character like "A" is the same as its ASCII encoding, using only one byte. However, a character such as 👍 requires four bytes. This means finding meaningful emoji in a string is not as simple as reading byte by byte, and is not even as simple as reading one UTF-8 character at a time, since a single emoji can span several code points.
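
    The difference between bytes, code points and emoji is easy to check. This is a small sketch using only core PHP functions; count_code_points is a hypothetical helper name of mine.

```php
<?php
// Count code points by matching one character at a time in UTF-8 mode.
function count_code_points(string $s): int {
    return preg_match_all('/./us', $s);
}

$samples = [
    'A'                     => "A",
    'thumbs up'             => "\u{1F44D}",
    'thumbs up + skin tone' => "\u{1F44D}\u{1F3FC}",
];

foreach ($samples as $label => $s) {
    // strlen() counts bytes, not characters.
    printf("%s: %d byte(s), %d code point(s), bytes %s\n",
        $label, strlen($s), count_code_points($s), bin2hex($s));
}
```

    "A" comes out as a single byte (41), 👍 as four bytes (f09f918d), and 👍🏼 as eight bytes holding two code points, even though it displays as one emoji.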

    Thankfully, EmojiOne has done the hard work of finding the emoji characters in a string. However, their library doesn't have a way to return the emoji it finds; it can only be used to replace them. I also didn't like the list of short names they use. I prefer the Slack names instead.

    I ended up combining the parsing regex from EmojiOne with the emoji data from Slack's data set, and turned this into a library that returns the data I want to use. Here's how it works.

    Given an input string that may contain emoji characters, this function will find any emoji in the string and return an array with information about each emoji found.

      $input = "Hello 👍🏼 World 👨‍👩‍👦‍👦";
      $emoji = Emoji\detect_emoji($input);

    Each entry in the returned array contains the following fields:

    • emoji - The emoji sequence found, as the original byte sequence. You can output this to show the original emoji.
    • short_name - The short name of the emoji, as defined by Slack's emoji data.
    • num_points - The number of Unicode code points that this emoji is composed of.
    • points_hex - An array of each Unicode code point that makes up this emoji. These are returned as hex strings. This will also include "invisible" characters such as the ZWJ character and skin tone modifiers.
    • hex_str - A list of all Unicode code points in their hex form separated by hyphens. This string is present in the Slack emoji data array.
    • skin_tone - If a skin tone modifier was used in the emoji, this field indicates which skin tone, since the short_name will not include the skin tone.
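
    To make the general approach concrete (a regex match plus a lookup table), here is a rough sketch. It is not the library's actual code. The two-entry $table is a hypothetical stand-in for Slack's data set with assumed short names, the pattern is built from the table rather than taken from EmojiOne, sketch_detect is a made-up name, and only some of the fields above are filled in.

```php
<?php
// Hypothetical stand-in for Slack's emoji data: emoji sequence => short name.
// Only two entries; the short names are assumed for illustration.
$table = [
    "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}\u{200D}\u{1F466}" => 'man-woman-boy-boy',
    "\u{1F44D}" => '+1',
];

function sketch_detect(string $input, array $table): array {
    // Try longer sequences first so a ZWJ sequence is not matched piecemeal.
    $keys = array_keys($table);
    usort($keys, function ($a, $b) { return strlen($b) - strlen($a); });
    $alternatives = implode('|', array_map('preg_quote', $keys));

    // Group 1: a known emoji. Group 2: an optional skin tone modifier.
    $pattern = '/(' . $alternatives . ')([\x{1F3FB}-\x{1F3FF}])?/u';
    preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

    $found = [];
    foreach ($matches as $m) {
        $tone = $m[2] ?? '';
        $found[] = [
            'emoji'      => $m[0],
            'short_name' => $table[$m[1]],
            'num_points' => preg_match_all('/./us', $m[0]),
            // U+1F3FB..U+1F3FF differ only in their last UTF-8 byte
            // (0xBB..0xBF), which maps to skin-tone-2..skin-tone-6.
            // Using null for "no modifier" is my choice in this sketch.
            'skin_tone'  => $tone === '' ? null : 'skin-tone-' . (ord($tone[3]) - 0xBB + 2),
        ];
    }
    return $found;
}

$result = sketch_detect(
    "Hello \u{1F44D}\u{1F3FC} World \u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}\u{200D}\u{1F466}",
    $table
);
print_r($result);
```

    With this input the sketch finds two entries: the thumbs up with 2 code points and skin-tone-3, and the family with 7 code points and no skin tone.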

    This package is now available on GitHub, and via Composer!

    composer require p3k/emoji-detector

    🎉👍

    Portland, Oregon
    5 mentions
    #100daysofindieweb #emoji #p3k #unicode
    Sat, Feb 18, 2017 2:38pm -08:00
  • https://waterpigs.co.uk/
    Happy birthday Barnaby! 🎉 🎂
    Portland, Oregon, USA
    Sat, Jan 7, 2017 5:08pm -08:00
  • Talks from October's #DonutJS are online! https://www.youtube.com/playlist?list=PLclEcT4yxER7Am62hEJBeRp-hN-yKvYRL 📽🎉🍩 @JeffLombardJr @theJeenaLee @malisas7 @caterinasworld @kopasetik
    Portland, Oregon, USA
    7 likes 3 reposts 1 mention
    #donutjs #video #livestream
    Thu, Oct 27, 2016 8:30am -07:00
  • Just published the videos from September's #DonutJS! 🍩🎉 https://www.youtube.com/playlist?list=PLRyLn6THA5wPegsjRJU_q2B7vze3yZRjZ
    Portland, Oregon, USA
    1 like 1 repost
    #DonutJS
    Fri, Oct 7, 2016 12:51pm -07:00
  • https://twitter.com/aaronpk/status/778003955569700864
    Well that was an adventure. Missed my connection in SLC because Timezones, but @Delta was nice enough to rebook me thru Paris! 🎉✈️👍🍻
    Salt Lake City, Utah, USA
    2 likes
    #travel
    Mon, Sep 19, 2016 5:18pm -07:00
  • #barbot in action pumping actual whiskey! 😀🍸🎉
    8 likes 1 reply
    #barbot
    Thu, Oct 29, 2015 8:55pm -07:00

Hi, I'm Aaron Parecki, co-founder of IndieWebCamp. I maintain oauth.net, write and consult about OAuth, and am the editor of several W3C specifications. I record videos for local conferences and help run a podcast studio in Portland.

I wrote 100 songs in 100 days! I've been tracking my location since 2008, and write down everything I eat and drink. I've spoken at conferences around the world about owning your data, OAuth, and quantified self, and have explained why R is a vowel.

  • Okta Developer Advocate
  • IndieWebCamp Founder
  • W3C Editor
  • Stream PDX Co-Founder
  • backpedal.tv

© 1999-2018 by Aaron Parecki. Powered by p3k. This site supports Webmention.
Except where otherwise noted, text content on this site is licensed under a Creative Commons Attribution 3.0 License.