<h1>Building APIs for AI: An interview with Zapier's Bryan Helmig</h1>
<p><em>By Garry Tan.</em></p>
<p>Nearly ten years ago I wrote about the API-ization of everything. It seemed obvious then that software that talks to other software would be critical for building world-changing startups. What was less obvious then, and more obvious now, is that those APIs would need to be connected to harness the full potential of everyday apps.</p>
<p>One of the best companies at connecting APIs is Zapier, which went through the YC Summer 2012 batch. Zapier, the leader in easy automation, makes it simple to automate workflows and move data across 5,000+ apps. Setup takes less than six minutes and doesn't require a single line of code. And with over 5,000 of the most popular B2B and consumer apps integrated, they're already powering over 10 million integration possibilities.</p>
<p>I got a chance to sit down with Bryan Helmig (<a href="https://twitter.com/bryanhelmig?lang=en">@bryanhelmig</a>), the co-founder and CTO of Zapier, to talk APIs and interoperability, and to learn more about the company's first-ever public API: Natural Language Actions (NLA). With this new API, they're making it possible to plug integrations directly into your product, and it's optimized for LLMs.</p>
<hr>
<p><strong>Bryan, thanks for joining me and finding time to catch up. Given the launch of your new Natural Language Actions (NLA) API, I'm sure there was some insight or trend you were seeing that guided your build.</strong></p>
<p>Bryan: Absolutely. AI apps have become the fastest-growing category of apps on Zapier's platform… ever. We're seeing huge demand from our users and partner ecosystem to plug AI and large language models into their existing tools, workflows, and automation. And Zapier is well positioned to help – 81 billion workflow tasks have already been created on our platform.</p>
<p>We actually started by prototyping LLM products in our own tech stack. We had two product experiments before NLA. The first was a fully chat-based Zap setup flow. With current-generation models, this often felt like playing "20 questions" with the model – not a great user experience. But it made us realize that other developers were likely facing the same challenges, and that Zapier could really deliver a seamless and simple developer experience in a way that no other company could.</p>
<p>From there, we focused on how to wrap up and simplify each individual API endpoint you might find across Zapier's 20k+ actions. We then allowed the model to call each one as a separate "tool." That was the fundamental design principle we used internally, and it's what we've now exposed as the NLA API – any developer can add integrations into their product or internal tools in 5-10 minutes.</p>
<p><strong>For a team that's the expert in APIs, launching Zapier's first public API is a big deal. What about LLMs made this project different from how you've approached APIs in the past?</strong></p>
<p>Prior to LLMs, we never felt like we could deliver the magical developer experience we wanted to. Under the hood, Zapier wraps up a ton of complexity from our ecosystem – our platform handles around 20 types of API auth, custom fields, versioning and migrations, arbitrary payload sizes, binary data. You name it. Making a Zapier API would have meant passing all of that complexity along to our end users.</p>
<p>But now, AI and LLMs bring an interesting inflection point for Zapier: the new Natural Language Actions API abstracts all of that complexity away from developers. In fact, the API has only one required parameter: "instructions". NLA can also be used in the more "classic" way, by calling it with hard-coded parameters instead of natural language parsing, but the natural language capabilities make it especially useful for people building products <em>on top</em> of LLMs. Ultimately, we are using LLMs to make APIs easier to use for both humans and other LLMs!</p>
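<p>To make the "instructions"-only interface concrete, here is a rough sketch of what a call can look like from Python. The endpoint paths, header name, and response shape below are assumptions for illustration rather than the documented contract – the NLA API documentation linked later in this interview is the source of truth – but the overall shape (an API key plus a plain-English instruction aimed at an action the user has exposed) follows Bryan's description above.</p>
<pre><code class="language-python">import requests

NLA_BASE = "https://nla.zapier.com/api/v1"  # assumed base path, mirroring the docs URL
API_KEY = "YOUR_NLA_API_KEY"                # NLA supports API keys and OAuth

headers = {"x-api-key": API_KEY}            # header name is an assumption for illustration

# 1. List the actions this user has chosen to expose to the model
#    (hypothetical endpoint name; check the NLA docs for the real one).
actions = requests.get(f"{NLA_BASE}/exposed/", headers=headers).json()
action_id = actions["results"][0]["id"]     # assumed response shape

# 2. Execute an action. "instructions" is the only required parameter --
#    the model fills in the action's required fields from the plain-English request.
resp = requests.post(
    f"{NLA_BASE}/exposed/{action_id}/execute/",
    headers=headers,
    json={"instructions": "Email jane@example.com a thank-you note for yesterday's demo"},
)
print(resp.json())  # a trimmed, LLM-friendly summary of the underlying API response
</code></pre>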
<p><strong>And what are some of the exciting things you're seeing people build with your APIs?</strong></p>
<p><a href="https://zapier.com/blog/how-a-contractor-uses-ai-to-write-business-emails/">There's this amazing story</a> about a contractor with dyslexia who teamed up with a client of his who happened to be familiar with Zapier. They built a Zap with OpenAI's GPT-3 to write better business emails. It totally transformed his communication and even helped him land a massive $200,000 contract! It's those stories of AI and automation coming together to help individual people that make me excited to be building on this technology today.</p>
<p>But really, we're just scratching the surface. We can't predict what all the builders on the Zapier platform will create. I mean, when we launched multi-step Zaps five years ago, we set a "sanity" limit of 30 [workflow] steps. We thought that would clearly be enough for anybody. But in less than 24 hours, users were inundating us with requests to raise the limit. As we dug in deeper, we found these beautiful, mind-blowing, complex Zaps – things we couldn't have ever imagined. With LLMs in the mix, we're hoping to enable that same level of creativity and power, now from the developer community.</p>
<p><strong>So with all of the power that LLMs bring to the table, can you share what's actually happening under the hood? How have you kept it simple?</strong></p>
<p>At its core, we leverage OpenAI's GPT-3.5 series to understand and process natural language instructions from the user, map them to a specific API call, and return the response from the API – all in a way that's optimized for LLMs.</p>
<p>First, users give explicit permission for the model to access certain actions. We try to make this super fast and simple, so it feels like an OAuth flow to the end user. When a user is setting this up, they're able to see what the required fields are and either let the AI guess or manually specify the values. Then, once in a developer's platform, the only required field for the user is the natural language instruction. We take that instruction from the user and let the model figure out how to fill in the required fields. The model then constructs an API call.</p>
<p>Before we send the results back, we also need to make them LLM- and human-readable. Many APIs return really complex data in their responses – data that would not only push an LLM over its token limit but also confuse both the model and the user. (As an example, a Gmail API call can return over 10,000 tokens!) We've done work on our end to trim the results down to just the relevant pieces. The NLA API currently guarantees that arbitrary API payloads will fit into 350 tokens or fewer. This makes it incredibly easy to use and build on the NLA API without worrying about the data going in or coming out of the underlying APIs.</p>
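<p>The 350-token guarantee is Zapier's own machinery, but the underlying idea – enforce a hard token budget on whatever an upstream API returns before handing it to a model – is easy to sketch. The example below uses OpenAI's open-source tiktoken tokenizer and a naive drop-the-largest-fields strategy; it illustrates the general technique, not how NLA actually trims payloads.</p>
<pre><code class="language-python">import json
import tiktoken  # OpenAI's open-source tokenizer

ENC = tiktoken.get_encoding("cl100k_base")

def token_len(value) -> int:
    """Token cost of a value once it is serialized for the prompt."""
    return len(ENC.encode(json.dumps(value, default=str)))

def trim_payload(payload: dict, budget: int = 350) -> dict:
    """Drop the most expensive fields until the serialized payload fits the token budget.

    Naive illustration only -- a real implementation would whitelist the fields
    that matter for each action instead of discarding purely by size.
    """
    trimmed = dict(payload)
    by_cost = sorted(trimmed, key=lambda k: token_len(trimmed[k]), reverse=True)
    for key in by_cost:
        if token_len(trimmed) <= budget:
            break
        trimmed.pop(key)
    return trimmed

# Example: a bloated (hypothetical) email payload squeezed down for an LLM.
raw = {"id": "178c1a", "subject": "Q2 invoice", "snippet": "Hi -- invoice attached...",
       "raw_mime": "A" * 50_000, "thread_headers": ["Received: ..."] * 200}
print(list(trim_payload(raw)))  # ['id', 'subject', 'snippet']
</code></pre>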
<p><strong>And for any aspiring API developer reading this – either looking to use your new APIs or building their own – any tips from the guys who live and breathe APIs all day?</strong></p>
<p>Definitely. The big thing many APIs "get wrong" is being overly complex, overly unique, and overly hard to get started with. You've talked about how Stripe and Lob have gotten payments and shipping right by simplifying complexity; we leaned on similar examples for inspiration. If you're building an API, you should too.</p>
<p>We're definitely big fans of libraries like <a href="https://django-ninja.rest-framework.com/">django-ninja</a> and <a href="https://fastapi.tiangolo.com/">FastAPI</a> for creating compelling APIs with baked-in types and documentation. We're using that sort of technology under the hood as well, both for design consistency and for scalability.</p>
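<p>As a tiny, generic illustration of that "baked-in types and documentation" point (not a piece of Zapier's stack): with FastAPI, the typed models below are enough to get request validation, serialization, and an interactive OpenAPI page at /docs with no extra work.</p>
<pre><code class="language-python">from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI(title="Toy actions API")

class ExecuteRequest(BaseModel):
    # Mirrors the spirit of NLA: one natural-language field drives the call.
    instructions: str = Field(..., description="Plain-English request, e.g. 'Send the weekly report to #ops'")

class ExecuteResponse(BaseModel):
    status: str
    result: dict

@app.post("/actions/{action_id}/execute", response_model=ExecuteResponse)
def execute_action(action_id: str, body: ExecuteRequest) -> ExecuteResponse:
    # A real implementation would route to the underlying integration here.
    return ExecuteResponse(status="ok", result={"action": action_id, "echo": body.instructions})

# Run with: uvicorn main:app --reload   (then open http://localhost:8000/docs)
</code></pre>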
<p>In the development of NLA, we've tried to be strict about not letting internal complexity filter down to end developers. NLA supports both OAuth and API keys for quickly getting started, and we have several off-the-shelf examples in the <a href="https://nla.zapier.com/api/v1/dynamic/docs">API documentation</a>, including a published <a href="https://blog.langchain.dev/langchain-zapier-nla/">LangChain integration</a>.</p>
<p>If you want to get started, any developer can <a href="https://nla.zapier.com/get-started/">create an API key right away</a>. We're excited to see what you can imagine – please share, tag me on Twitter (<a href="https://twitter.com/bryanhelmig">@bryanhelmig</a>), and show me what you've got. Even better, I'd love feedback on what we've built, and we're here to answer questions. And if you're an API geek like the rest of us at Zapier… <a href="https://zapier.com/jobs/">we're hiring</a>.</p>
<hr>
<h1>How to maintain engineering velocity as you scale</h1>
<p><em>By Marcelo Cortes, co-founder and CTO of Faire.</em></p>
<p>Engineering is typically the function that grows fastest at a scaling startup. It requires a lot of attention to make sure the pace of execution does not slow and cultural issues do not emerge as you scale.</p>
<p>We've learned a lot about pace of execution in the past five years at Faire. When we launched in 2017, we were a team of five engineers.
From the beginning, we built a simple but solid foundation that allowed us to maintain both velocity and quality. When we found product-market fit later that year and started bringing on lots of new customers, instead of spending engineering resources on re-architecting our platform to scale, we were able to double down on product engineering to accelerate growth. In this post, we discuss the guiding principles that allowed us to maintain our engineering velocity as we scaled.</p>
<h2 id="four-guiding-principles-to-maintaining-velocity">Four guiding principles to maintaining velocity</h2>
<p>Faire's engineering team grew from five to over 100 engineers in three years. Throughout this growth, we were able to sustain our pace of engineering execution by adhering to four important elements:</p>
<ol>
<li><a href="#1-hire-the-best-engineers">Hiring the best engineers</a></li>
<li><a href="#2-build-a-solid-long-term-foundation-from-day-one">Building solid long-term foundations from day one</a></li>
<li><a href="#3-track-engineering-metrics-to-drive-decision-making">Tracking metrics to guide decision-making</a></li>
<li><a href="#4-keep-teams-small-and-independent">Keeping teams small and independent</a></li>
</ol>
<h2 id="1-hire-the-best-engineers">1. Hire the best engineers</h2>
<p>You want to hire the best early team that you can, as they're going to be the people helping you scale and maintain velocity. And good people follow good people, helping you grow your team down the road.</p>
<p>This sounds obvious, but it's tempting to get people in seats fast because you have a truckload of priorities and you're often the only one doing engineering recruiting in those early years. What makes this even harder is that you often have to play the long game to get the best engineers signed on. Your job is to build a case for why your company is <em>the</em> opportunity for them.</p>
<p>We had a few amazing engineers in mind we wanted to hire early on. I spent over a year doing coffee meetings with some of them. I used these meetings to get advice, but more importantly I was always giving them updates on our progress, vision, fundraising, and product releases. That created FOMO, which eventually got them so excited about what was happening at Faire that they signed up for the ride.</p>
<p>While recruiting, I looked for key competencies that I thought were vital for our engineering team to be successful as we scaled. These were:</p>
<h3 id="a-experts-at-our-core-technology">a. Experts at our core technology</h3>
<p>In the early stages, you need to move extremely fast and you cannot afford to make mistakes. We wanted the best engineers who had previously built the components we needed, so they knew where mistakes could happen, what to avoid, what to focus on, and more. For example, we built a complex payments infrastructure in a couple of weeks. That included integrating with multiple payment processors in order to charge debit/credit cards, process partial refunds, retry asynchronously, void canceled transactions, and link bank accounts for ACH payouts.
We had built similar infrastructure for the Cash App at Square, and that experience allowed us to move extremely quickly while avoiding pitfalls.</p>
<h3 id="b-focused-on-delivering-value-to-customers">b. Focused on delivering value to customers</h3>
<p>Faire's mission is to empower entrepreneurs to chase their dreams. When hiring engineers, we looked for people who were amazing technically but also understood our business, were customer-focused, were passionate about entrepreneurship – and understood how they needed to work. That is, they understood how to use technology to add value to customers and product, quickly and with quality. To test for this, I would ask questions like: "Give me examples of how you or your team impacted the business." Their answers would show how well they understood their current company's business and how engineering can impact customers and change a company's top-line numbers.</p>
<p>I also learned a lot when I let them ask questions about Faire. I love when engineering candidates ask questions about how our business works, how we make money, what our market size is, etc. If they don't ask these kinds of questions, I ask them things like: "Do you understand how Faire works?" "Why is Faire good for retailers?" "How would you sell Faire to a brand?" After asking questions like these a few times, you'll see patterns and be able to quickly identify engineers who are business-minded and customer-focused.</p>
<p>Another benefit of hiring customer-focused engineers is that it's much easier to shut down projects, start new ones, and move people around, because everyone is focused on delivering value for the customer and not wedded to the products they helped build. During COVID, our customers saw enormous change, with in-person trade shows getting canceled and lockdowns impacting in-person foot traffic. We had to adapt quickly, which required us to stop certain initiatives and move our product and engineering teams to launch new ones, such as our own version of <a href="https://blog.faire.com/thestorefront/introducing-faire-summer-market-our-first-online-trade-show-event/">online trade shows</a>.</p>
<h3 id="c-grit">c. Grit</h3>
<p>When we first started, we couldn't afford to build the most beautiful piece of engineering work. We had to be fast and agile. This is critical when you are pre-product-market fit. Our CEO Max and a few early employees would go to trade shows to present our product to customers, understand their needs, and learn what resonated with them. Max would call us with new ideas several times a day. It was paramount that our engineers were <a href="https://angeladuckworth.com/grit-book/">gritty</a> and able to quickly make changes to the product. Over the three or four days of a trade show, our team deployed changes nonstop to the platform. We experimented with offerings like:</p>
<ul>
<li>Free shipping on first orders</li>
<li>Buy now, pay later</li>
<li>Buy from a brand and get $100 off when you re-order from the same brand</li>
<li>Free returns</li>
</ul>
<p>By trying different value propositions in a short time, our engineering team helped us figure out what was most valuable to our customers.
That was how we found strong product-market fit within six months of starting the company.</p>
<p><em>[Figure: Our trade show storefront back when we were called Indigo Fair.]</em></p>
<h2 id="2-build-a-solid-long-term-foundation-from-day-one">2. Build a solid long-term foundation from day one</h2>
<p>The number one impediment to engineering velocity at scale is the lack of a solid, consistent foundation. A simple but solid foundation will allow your team to keep building on top of it instead of having to throw away or re-architect your base when hypergrowth starts.</p>
<p>To create a solid long-term foundation, you first need to get clear on what practices you believe are important for your engineering team to scale. For example, I remember speaking with senior engineers at other startups who were surprised that we were writing tests and doing code reviews and that we had a code style guide from the very early days. But we couldn't have operated well without these processes. When we started to grow fast and add lots of engineers, we were able to keep over 95% of the team focused on building features and adding value to our customers, increasing our growth.</p>
<p>Once you know what long-term foundations you want to build, you need to write them down. We were intentional about this from day one and documented it in our <a href="https://craft.faire.com/handbook-89f166841ec9">engineering handbook</a>. Today, every engineer is onboarded using this handbook.</p>
<p>The four foundational elements we decided on were:</p>
<h3 id="a-being-data-driven">a. Being data-driven</h3>
<p>The most important thing is to build your data muscle early. We started doing this at 10 customers. At the time, the data wasn't particularly useful; the more important thing was to start collecting it. At some point, you'll need data to drive product decision-making. The longer you wait, the harder it is to embed into your team.</p>
<p>Here's what I recommend you start doing as early as possible:</p>
<ul>
<li>Set up data pipelines that feed into a data warehouse.</li>
<li>Start collecting data on how people are using your product. As you add features and iterate, record how those changes are impacting user interactions. All of this should go into a data warehouse that is updated within minutes and made available to your team. As your product gets increasingly complex, it will become more and more important to use data to validate your intuition.</li>
<li>We use Redshift to store data. As user events happen, our relational database (MySQL) replicates them into Redshift. Within minutes, the data is available for queries and reports.</li>
<li>Train your team to use experimentation frameworks (a minimal sketch of experiment assignment follows this list).</li>
<li>Make experimentation part of the product development process. The goal is to transform your intuition into a statistically testable statement. A good place to start is to establish principles and high-level steps for your team to follow when they run experiments. We've set principles around when to run experiments vs. when not to, why running rigorous experiments should be the default (and when it isn't), and when to stop an experiment earlier than expected. We also have teams log experiments in a Notion dashboard.</li>
<li>The initial focus should be on what impact you think a feature will have and how to measure that change. As you're scoping a feature, ask questions like: How are we going to validate that this feature is achieving its intended goals? What events/data do we need to collect to support that? What reports are we going to build? Over time, these core principles will expand.</li>
<li>The entire team should be thinking about this, not just the engineers or the data team. We reinforced the importance of data fluency by pushing employees to learn SQL, so that they could run their own queries and experience the data firsthand.</li>
<li>It'll take you multiple reps to get this right. We still miss steps and fail to collect the right data. The sooner you get your team doing this, the easier it will be to teach it to new people and become better at it as an organization.</li>
</ul>
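<p>How users get assigned to experiment variants is the piece teams most often get wrong, so here is a minimal, generic sketch of deterministic assignment: hash the user and experiment name into a bucket so a given user always sees the same variant, with no assignment state to store. This illustrates the idea rather than Faire's actual framework; the exposure event would flow into the warehouse like any other user event.</p>
<pre><code class="language-python">import hashlib
import json
import time

def assign_variant(user_id: str, experiment: str, variants=("control", "treatment")) -> str:
    """Deterministic bucketing: the same user + experiment always maps to the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def log_event(event_type: str, user_id: str, properties: dict) -> None:
    # Stand-in for whatever writes events toward the warehouse (queue, MySQL table, etc.).
    print(json.dumps({"ts": time.time(), "type": event_type, "user_id": user_id, **properties}))

variant = assign_variant("user_42", "free_shipping_first_order")
log_event("experiment_exposure", "user_42",
          {"experiment": "free_shipping_first_order", "variant": variant})
</code></pre>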
<h3 id="b-our-choice-of-programming-language-and-database">b. Our choice of programming language and database</h3>
<p>When choosing a language and database, pick something you know best that is also scalable long-term. If you choose a language you don't know well because it seems easier or faster to get started, you won't foresee pitfalls and you'll have to learn as you go. This is expensive and time-consuming. We started with Java as our backend programming language and MySQL as our relational database. In the early days, we were building two to three features per week, and it took us a couple of weeks to build the framework we needed around MySQL. This was a big tradeoff that paid dividends later on.</p>
<h3 id="c-writing-tests-from-day-one">c. Writing tests from day one</h3>
<p>Many startups think they can move faster by not writing tests; it's the opposite. Tests help you avoid bugs and prevent legacy code at scale. They aren't just validating the code you are writing now. They should be used to enforce, validate, and document requirements. Good tests protect your code from future changes as your codebase grows and features are added or changed. They also catch problems early and help avoid production bugs, saving you time and money. Code without tests becomes legacy very fast. Within months after untested code is written, no one will remember the exact requirements, edge cases, constraints, etc. If you don't have tests to enforce these things, new engineers will be afraid of changing the code in case they break something or change an expected behavior.</p>
<p>There are two reasons why tests break when a developer is making code changes:</p>
<ul>
<li>Requirements change. In this case, we expect tests to break, and they should be updated to validate and enforce the new requirements.</li>
<li>Behavior changes unexpectedly. For example, a bug was introduced and the test alerted us early in the development process.</li>
</ul>
<p>Every language has tools to measure and keep track of test coverage. I highly recommend introducing them early to track how much of your code is protected by tests. You don't need to have 100% code coverage, but you should make sure that critical paths, important logic, edge cases, etc. are well tested. <a href="https://leanylabs.com/blog/good-unit-tests/">Here are tips for writing good tests</a>.</p>
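<p>A small, generic example of "tests as executable requirements." Faire's backend is Kotlin, but the idea reads the same in any language, so the sketch below uses Python and pytest; the discount rule and its numbers are invented for illustration. The point is that the requirements and edge cases (first order only, capped amount, invalid input) are written down somewhere no one can forget them.</p>
<pre><code class="language-python">import pytest

def first_order_discount(order_total_cents: int, is_first_order: bool) -> int:
    """Hypothetical rule: first orders get 10% off, capped at $100; negative totals are invalid."""
    if order_total_cents < 0:
        raise ValueError("order total cannot be negative")
    if not is_first_order:
        return 0
    return min(order_total_cents // 10, 100_00)

def test_first_order_gets_ten_percent_off():
    assert first_order_discount(50_00, is_first_order=True) == 5_00

def test_discount_is_capped_at_100_dollars():
    assert first_order_discount(20_000_00, is_first_order=True) == 100_00

def test_repeat_orders_get_no_discount():
    assert first_order_discount(50_00, is_first_order=False) == 0

def test_negative_totals_are_rejected():
    with pytest.raises(ValueError):
        first_order_discount(-1, is_first_order=True)
</code></pre>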
<h3 id="d-doing-code-reviews">d. Doing code reviews</h3>
<p>We started doing code reviews when we hired our first engineer. Having another engineer review your code changes helps ensure quality, prevents mistakes, and shares good patterns. In other words, it's a great learning tool for new and experienced engineers. Through code reviews, you are teaching your engineers patterns: what to avoid, why to do something a certain way, which language features to use and which to avoid.</p>
<p>Along with this, you should have a coding style guide. Style guides help enforce consistency and quality across your engineering team. It doesn't have to be complex. We use a tool that formats our code so our style guide is automatically enforced before a change can be merged. This leads to higher code quality, especially when teams are collaborating and other people are reviewing code.</p>
<p>We switched from Java to Kotlin in 2019, and we have a comprehensive style guide that includes recommendations and rules for programming in Kotlin. For anything not explicitly specified in our guide, we ask that engineers follow <a href="https://kotlinlang.org/docs/coding-conventions.html">JetBrains' coding conventions</a>.</p>
<p>These are the code review best practices we share internally:</p>
<ul>
<li>#bekind when doing a code review. Use positive phrasing where possible ("there might be a better way" instead of "this is terrible"; "how about we name this X?" instead of "naming this Y is bad"). It's easy to unintentionally come across as critical, especially if you have a remote team.</li>
<li>Don't block changes from being merged if the issues are minor (e.g., a request for a variable name change, indentation fixes). Instead, make the ask verbally. Only block merging if the request contains potentially dangerous changes that could cause issues, or if there is an easier/safer way to accomplish the same thing.</li>
<li>When doing a code review, ensure that the code adheres to your style guide. When giving feedback, refer to the relevant sections of the style guide.</li>
<li>If the code review is large, consider checking out the branch locally and inspecting the changes in IntelliJ (the Git tab at the bottom). It's easier to have all of the navigation tools at hand.</li>
</ul>
<h2 id="3-track-engineering-metrics-to-drive-decision-making">3. Track engineering metrics to drive decision-making</h2>
<p>Tracking metrics is imperative to maintaining engineering velocity. Without clear metrics, Faire would be in the dark about how our team is performing and where we should focus our efforts. We would have to rely on intuition and assumptions to guide what we should be prioritizing.</p>
<p>Examples of metrics we started tracking early (at around 20 engineers) included:</p>
<ul>
<li><strong>Uptime.</strong> One of the first metrics we tracked was <a href="https://docs.datadoghq.com/integrations/uptime/">uptime</a>. We started measuring this because we were receiving anecdotal reports of site stability issues. Once we started tracking it, we confirmed the anecdotal evidence and dedicated a few engineers to resolving the issue.</li>
<li><strong>CI wait time.</strong> Another metric that was really important was CI wait time (i.e., the time for the build system to build/test pull requests).
We were receiving anecdotal reports of long CI wait times from developers; we confirmed them with data and fixed the issue.</li>
</ul>
<p><em>[Figure: This is a dashboard we created in the early days of Faire to track important engineering metrics. It was updated manually by collecting data from different sources. Today, we have more comprehensive dashboards that are fully automated.]</em></p>
<p>Once our engineering team grew to 100+, our top-level metrics became more difficult to take action against. When metrics trended beyond concerning thresholds, we didn't have a clear way to address them. Each team was busy with its own product roadmap, and it didn't seem worthwhile to spin up new teams to address temporary needs. Additionally, many of the problems were large in scale and would have required a dedicated group of engineers.</p>
<p>We found that the best solution was to build <a href="https://www.datadoghq.com/blog/the-power-of-tagged-metrics/">dimensions</a> so that we could view metrics by team. Once we had metrics cut by team, we could set top-down expectations and priorities. We were happy to see that individual teams did a great job of taking ownership of and improving their metrics and, consequently, the company's top-level metrics.</p>
<h4 id="an-example-transaction-run-duration">An example: transaction run duration</h4>
<p>Coming out of our virtual trade show, <a href="https://blog.faire.com/thestudio/faire-summer-market-2021-our-global-trade-show-event-is-coming-in-july/">Faire Summer Market</a>, we knew we needed significant investment in our database utilization. During the event, site usage pushed our database capacity to its limits, and we realized we wouldn't be able to handle similar events in the future.</p>
<p>In response, we created a metric of how long transactions were held open every time our application interacted with the database. Each transaction was attributed to a specific team. We then had a visualization of the hottest areas of our application, along with the teams responsible for those areas. We asked each team to set a goal during our planning process to reduce their database usage by 20% over a three-month period. The aggregate results were staggering. Six months later, before our next event – <a href="https://blog.faire.com/thestorefront/announcing-faires-2022-winter-virtual-trade-show-events/">Faire Winter Market</a> – incoming traffic was 1.6x higher, but we were nowhere close to maxing out our database capacity. Now, each team is responsible for monitoring its database utilization and ensuring it doesn't trend in the wrong direction.</p>
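<p>The mechanics behind a metric like this are simple enough to sketch. Below is a generic Python context manager that times how long a transaction stays open and emits the duration tagged with the owning team. The emit_metric function is a stand-in for a real metrics client (Faire's exact setup isn't described here); tagging every measurement with a team is the part that made the metric actionable.</p>
<pre><code class="language-python">import time
from contextlib import contextmanager

def emit_metric(name: str, value_ms: float, tags: dict) -> None:
    # Stand-in for a real metrics client (StatsD, Datadog agent, etc.).
    print(f"{name}={value_ms:.1f}ms tags={tags}")

@contextmanager
def timed_transaction(conn, team: str, endpoint: str):
    """Time how long a DB transaction is held open and attribute it to the owning team."""
    start = time.monotonic()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000
        emit_metric("db.transaction.open_duration", elapsed_ms,
                    {"team": team, "endpoint": endpoint})

# Usage, where conn is any DB-API connection:
# with timed_transaction(conn, team="payments", endpoint="checkout") as c:
#     c.cursor().execute("UPDATE orders SET state = 'paid' WHERE id = %s", (order_id,))
</code></pre>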
<h3 id="managing-metrics-with-kpi-scorecards">Managing metrics with KPI scorecards</h3>
<p>We're moving towards a model where each team maintains a set of key performance indicators (KPIs) that get published as a scorecard reflecting how successful the team is at maintaining its product areas and the parts of the tech stack it owns.</p>
<p>We're starting with a top-level scorecard for the whole engineering team that tracks our highest-level KPIs (e.g., <a href="https://docs.datadoghq.com/tracing/guide/configure_an_apdex_for_your_traces_with_datadog_apm/">Apdex</a>, database utilization, CI wait time, severe bug escapes, flaky tests). Each team maintains a scorecard with its assigned top-level KPIs as well as domain-specific KPIs. As teams grow and split into sub-teams, the scorecards follow the same path recursively. Engineering leaders managing multiple teams use these scorecards to gauge the relative success of their teams and to better understand where they should be focusing their own time.</p>
<p>Scorecard generation should be as automated and as simple as possible so that it becomes a regular practice. If your process requires a lot of manual effort, you're likely going to have trouble committing to it on a regular cadence. Many of our metrics start in Datadog; we use their API to extract the relevant metrics, push them into Redshift, and then visualize them in Mode reports (a sketch of that extract-and-load step follows the list below).</p>
<p>As we've rolled this process out, we've identified criteria for what makes a great engineering KPI:</p>
<ul>
<li><strong>Can be measured and has a believable source of truth.</strong> If capturing and viewing KPIs is not an easy and repeatable task, it's bound to stop happening. Invest in the infrastructure to reliably capture KPIs in a format that can be easily queried.</li>
<li><strong>Clearly ladders up to a top-level business metric.</strong> If there isn't a clear connection to a top-level business metric, you'll have a hard time convincing stakeholders to take action based on the data. For example, we've started tracking pager volume for our critical services: high pager volume contributes to tired and distracted engineers, which leads to less code output, which leads to fewer features delivered, which ultimately means less customer value.</li>
<li><strong>Is independent of other KPIs.</strong> When viewing and sharing KPIs, give appropriate relative weight to each one depending on your priorities. If you're showing two highly correlated KPIs (e.g., cycle time and PR throughput), then you're not leaving room for something that's less correlated (e.g., uptime). You might want to capture some correlated KPIs so that you can quickly diagnose a worrying trend, but you should present non-duplicative KPIs when crafting the overall scorecard that you share with stakeholders.</li>
<li><strong>Is normalized in a meaningful way.</strong> Looking at absolute numbers can be misleading in a high-growth environment, which makes it hard to compare performance across teams. For example, we initially tracked growth of overall infrastructure cost. The numbers more than doubled every year, which was concerning. When we later normalized this KPI by the amount of revenue a product was producing, we observed that the KPI was flat over time. Now we have a clear KPI of "amount spent on infrastructure to generate $1 in revenue." This resulted in us being comfortable with our rate of spend, whereas previously we were considering staffing a team to address growing infrastructure costs.</li>
</ul>
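<p>A minimal sketch of that extract-and-load step, assuming an API-key-authenticated metrics query endpoint and a Redshift cluster reachable over the standard Postgres protocol (which Redshift speaks). The Datadog endpoint, query string, and response shape used here are assumptions to be checked against their docs; the point is how little code the "metrics land in the warehouse on a schedule" loop needs.</p>
<pre><code class="language-python">import time
import requests
import psycopg2  # Redshift accepts Postgres-protocol connections

DD_HEADERS = {"DD-API-KEY": "YOUR_API_KEY", "DD-APPLICATION-KEY": "YOUR_APP_KEY"}

def fetch_series(query: str, hours: int = 24) -> list:
    """Pull a timeseries from the metrics API (endpoint and params assumed; see the Datadog docs)."""
    now = int(time.time())
    resp = requests.get(
        "https://api.datadoghq.com/api/v1/query",
        headers=DD_HEADERS,
        params={"from": now - hours * 3600, "to": now, "query": query},
    )
    resp.raise_for_status()
    # Assumed response shape: {"series": [{"pointlist": [[ts_ms, value], ...]}, ...]}
    return resp.json().get("series", [])

def load_into_redshift(metric: str, team: str, series: list) -> None:
    rows = [(metric, team, int(ts / 1000), value)
            for s in series for ts, value in s["pointlist"] if value is not None]
    with psycopg2.connect(host="example-cluster.redshift.amazonaws.com", port=5439,
                          dbname="analytics", user="loader", password="...") as conn:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO eng_kpis (metric, team, ts, value) VALUES (%s, %s, %s, %s)", rows)

# Illustrative query string; each team's KPIs would be tagged and pulled the same way.
series = fetch_series("avg:mysql.performance.user_time{team:payments}")
load_into_redshift("db_user_time", "payments", series)
</code></pre>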
<p>We plan to keep investing in this area as we grow. KPIs allow us to work and build with confidence, knowing that we're focusing on the right problems to continue serving our customers.</p>
<h2 id="4-keep-teams-small-and-independent">4. Keep teams small and independent</h2>
<p>When we were a company of 25 employees, we had a single engineering team. Eventually, we split into two teams in order to prioritize multiple areas simultaneously and ship faster. When you split into multiple teams, things can break because people lose context. To navigate this, we developed a pod structure to ensure that every team was able to operate independently but with all the context and resources they needed.</p>
<p>When you first create a pod structure, here are some rules of thumb:</p>
<ul>
<li><strong>Pods should operate like small startups.</strong> Give them a mission, goals, and the resources they need. It's up to them to figure out the strategy to achieve those goals. Pods at Faire typically do an in-person offsite to brainstorm ideas and come up with a prioritized roadmap and expected business results, which they then present for feedback and approval.</li>
<li><strong>Each pod should have no more than 8 to 10 employees.</strong> For us, pods generally include 5 to 7 engineers (including an engineering manager), a product manager, a designer, and a data scientist.</li>
<li><strong>Each pod should have a clear leader.</strong> We have an engineering manager and a product manager co-lead each pod. We designed it this way to give engineering a voice and more ownership in the planning process.</li>
<li><strong>Expect people to be members of multiple pods.</strong> While this isn't ideal, there isn't any other way to do it early on. Resources are constrained, and you need a combination of seasoned employees and new hires on each pod (otherwise they'll lack context). Pick one or two people who have lots of context to seed the pod, then add new members. When we first did this, pods shared backend engineers, designers, and data analysts, and had their own product manager and frontend engineer.</li>
<li><strong>If you only have one product, assign a pod to each well-defined part of the product.</strong> If there's not an obvious way to split up your product surface area, try to break it into large features and assign a pod to each.</li>
<li><strong>Keep reporting lines and performance management within functional teams.</strong> This makes it easier to maintain (1) standardized tooling and processes across the engineering team, and balanced leadership between functions; and (2) standardized career frameworks and performance calibration. We give our managers guidance and tools to make sure this is happening. For example, I have a spreadsheet for every manager that I expect them to update on a monthly basis with a scorecard and a brief summary of their direct reports' performance.</li>
</ul>
<h3 id="how-we-stay-on-top-of-resource-allocation-census-and-horsepower">How we stay on top of resource allocation: Census and Horsepower</h3>
<p>Our engineering priorities change often. We need to be able to move engineers around and create, merge, split, or sunset pods. In order to keep track of who is on which team – taking into account where that person is located, their skill set, tenure at the company, and more – we built a tool called Census.</p>
<p>Census is a real-time visualization of our team's structure. It automatically updates with data from our ATS and HR system. The visual aspect is crucial and makes it easier for leadership to make decisions around resource allocation and pod changes as priorities shift. Alongside Census, we also built an algorithm to evaluate the "horsepower" of a pod. If horsepower is showing up as yellow or red, that pod either needs more senior engineers, has a disproportionate number of new employees, or both.</p>
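<p>The article doesn't spell out the horsepower formula, so the sketch below is only a guess at its general shape: weight each member by seniority, discount people who are still ramping up or split across several pods, and compare the total against what the pod's scope requires. The weights and thresholds are invented; the takeaway is that even a crude, explainable score can flag under-seeded pods before they miss a quarter.</p>
<pre><code class="language-python">from dataclasses import dataclass

@dataclass
class Engineer:
    level: int             # e.g. 1 (new grad) .. 5 (staff)
    months_at_company: int
    pods: int = 1          # how many pods this person is split across

def pod_horsepower(members: list, required: float) -> str:
    """Toy scoring: seniority-weighted capacity, discounted for ramp-up and pod-splitting."""
    score = 0.0
    for m in members:
        ramp = min(m.months_at_company / 6, 1.0)   # assume ~6 months to full productivity
        score += (m.level / 5) * ramp / m.pods
    ratio = score / required
    if ratio >= 1.0:
        return "green"
    return "yellow" if ratio >= 0.7 else "red"

pod = [Engineer(level=5, months_at_company=36),
       Engineer(level=2, months_at_company=2),
       Engineer(level=3, months_at_company=1, pods=2)]
print(pod_horsepower(pod, required=1.5))  # "yellow": one senior anchor, most members still ramping
</code></pre>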
<p><em>[Figure: Census.]</em></p>
<p><em>[Figure: Pods are colored either green, yellow, or red depending on their horsepower.]</em></p>
<p>One of the most common questions founders have is how to balance speed with everything else: product quality, architecture debt, team culture. Too often, startups stall out and sacrifice their early momentum in order to correct technical debt. In building Faire, we set out to both establish a unified foundation <em>and</em> continue shipping fast. These four guiding principles are how we did it, and I hope they help others do the same.</p>
<hr>
<h1>How to Use Responsive Images</h1>
<p>In the world of responsive web design one core, yet complicated, spec can net you substantial reductions in page size across the device spectrum. In this post I'll demystify the complexity in the responsive images spec so you can use these powerful HTML attributes on your site.
In part 2 you will learn how to build your own responsive image workflow, with a <a href="https://github.com/webflow/responsive-images-demo">code demo</a> that distills our responsive image stack into a single file. We'll also dive into how we automate responsive images at scale, processing millions of images at Webflow with AWS Lambda.</p>
<p>Let's dive in!</p>
<h3>Responsive Images on Today's Web</h3>
<p>The <code>img</code> element has been around for a long time. Give it a <code>src</code> attribute and you're well on your way. The spec adds two new attributes which the browser uses to make an image responsive.</p>
<p>The new attributes are <code>sizes</code> and <code>srcset</code>. To put it simply: <code>sizes</code> tells the browser how big the <code>img</code> will render, and <code>srcset</code> gives the browser a list of image variants to choose from. The goal is to hint to the browser which variant in <code>srcset</code> to start downloading as soon as possible.</p>
<p>The browser takes the <code>srcset</code> and <code>sizes</code> attributes you provide, combines them with the window width and screen density it already knows about, and can start downloading the correct image variant right after the HTML is parsed – before anything is rendered; before CSS and JavaScript are even loaded. Modern browsers with pre-fetching enabled can start downloading the correct variant before you even navigate to the page. That's a huge end-user performance increase!</p>
<p>To see this in action, check out <a href="https://webflow.com/feature/responsive-images">https://webflow.com/feature/responsive-images</a> and open the network inspector to see the browser loading the correct variants.</p>
<h2>Responsive Attributes</h2>
<h3>How to Use Srcset</h3>
<p><code>srcset</code> is just a list of image variants. You can specify a pixel density next to each variant in the list, like this: <code>srcset="http://variant-1.jpg 2x, http://variant-2.jpg 1.5x"</code>. However, this format only solves for hardware, serving better quality images on better quality displays, and does little for responsive design.</p>
<p>What you really want is to list variants by pixel width, so that when your site is loaded on a mobile layout and rendered at 500px wide, or on a desktop layout at 750px wide, it'll only download the variant it needs to render that layout. The width-based format looks like this: <code>srcset="http://variant-1.jpg 500w, http://variant-2.jpg 750w, http://variant-3.jpg 1000w, http://variant-4.jpg 1500w"</code>. The <code>w</code> here represents the pixel width of the actual image file that the corresponding URL points to.</p>
<hr>
<h1>From edge2cat to edge2anything with Tensorflow</h1>
<p><em>By Dan Whitenack, data scientist at Pachyderm (YC W15).</em></p>
<p>Unless you have been hiding under a rock for the past few months, you have likely seen Christopher Hesse's demo of image-to-image translation (a <a href="https://www.tensorflow.org/">Tensorflow</a> port of <a href="https://github.com/phillipi/pix2pix">pix2pix</a> by Isola et al.). In case you missed it, search for <a href="https://www.google.com/#q=edge2cat">edge2cat</a>, and a whole new world of cat-infused artificial intelligence will be opened to you. The model is trained on cat images, and it can translate hand drawn cats to realistic images of cats!
Here are a few of our personal favorite "edge" image to cat translations generated by Chris's model, ranging from accurate to horrifying:</p>
<p><em>[Figure: example edge-to-cat translations.]</em></p>
<p>Here, Pachyderm has created a totally reusable and generic pipeline that takes care of all the training, pre-processing, etc. for you, so you can jump right into the fun parts! They utilize <a href="https://medium.com/pachyderm-data/sustainable-machine-learning-workflows-8c617dd5506d#.mmwccp55c">this machine learning pipeline template</a> (produced by the team at Pachyderm in collaboration with Chris) to show how easy it can be to deploy and manage image generation models (like those pictured above). Everything you need to run the reusable pipeline can be found <a href="https://github.com/pachyderm/pachyderm/tree/master/doc/examples/ml/tensorflow">here on GitHub</a>, and is described below.</p>
<h2>The Model</h2>
<p>Christopher Hesse's image-to-image demos use a Tensorflow implementation of the Generative Adversarial Networks (or GANs) model presented in <a href="https://arxiv.org/pdf/1611.07004v1.pdf">this article</a>. Chris's full Tensorflow implementation of this model can be found <a href="https://github.com/affinelayer/pix2pix-tensorflow">on GitHub</a> and includes documentation about how to perform training, testing, pre-processing of images, exporting of the models for serving, and more.</p>
<p>In this post we will utilize Chris's code from that repo along with a <a href="https://github.com/dwhitena/pach-pix2pix/blob/master/Dockerfile">Docker image</a> based on <a href="https://hub.docker.com/r/affinelayer/pix2pix-tensorflow/">an image he created</a> to run the scripts (which you can also utilize in your experiments).</p>
<h2>The Pipeline</h2>
<p>To deploy and manage the model, we will execute its training, model export, pre-processing, and image generation in the reusable <a href="http://pachyderm.io/pps.html">Pachyderm pipeline</a> mentioned above. This will allow us to:</p>
<ol>
<li>Keep a rigorous historical record of exactly what models were used on what data to produce which results.</li>
<li>Automatically update online ML models when training data or parameterization changes.</li>
<li>Easily revert to other versions of an ML model when a new model is not performing or when "bad data" is introduced into a training data set.</li>
</ol>
<p>The general structure of our pipeline looks like this:</p>
<p><em>[Figure: pipeline structure diagram.]</em></p>
<p>You can run the pipeline on a local installation of Pachyderm. Alternatively, you can quickly spin up a real Pachyderm cluster in any one of the popular cloud providers. Check out the <a href="http://docs.pachyderm.io/">Pachyderm docs</a> for more details on deployment.</p>
<p>Once deployed, you will be able to use Pachyderm's <code>pachctl</code> CLI tool to create data repositories and start our deep learning pipeline.</p>
<h2>Preparing the Training and Model Export Stages</h2>
<p>First, let's prepare our training and model export stages.
Chris Hesse's <code>pix2pix.py</code> script includes:</p>
<ul>
<li>A "train" mode that we will use to train our model on a set of paired images (such as facades paired with labels or edges paired with cats). This training will output a "checkpoint" representing a persisted state of the trained model.</li>
<li>An "export" mode that will then allow us to create an exported version of the checkpointed model to use in our image generation.</li>
</ul>
<p>Thus, our "model training and export" stage can be split into a training stage (called "checkpoint") producing a model checkpoint and an export stage (called "model") producing a persisted model used for image generation:</p>
<p><em>[Figure: training and export stages.]</em></p>
<p>Input images also need to be pre-processed (resized) before generation; we use a <code>process.py</code> script to perform the resizing.</p>
<p>To actually perform our image-to-image translation, we need to use a <a href="https://github.com/affinelayer/pix2pix-tensorflow/blob/master/server/tools/process-local.py">process_local.py script</a>. This script will take our pre-processed images and persisted model as input and output the generated, translated result:</p>
<p><em>[Figure: pre-processing and generation stages.]</em></p>
<p>We then create another JSON specification, <code>pre-processing_and_generation.json</code>, telling Pachyderm to (i) run the <code>process.py</code> script on the data in the "input_images" repository, outputting to the "preprocess_images" repository, and (ii) run <code>process_local.py</code> with the model in the "model" repository and the images in the "preprocess_images" repository as input. This can be done by running <code>pachctl create-pipeline -f pre-processing_and_generation.json</code>.</p>
<h2>Putting it All Together, Generating Images</h2>
<p>Now that we have created our input data repositories ("input_images" and "training") and we have told Pachyderm about all of our processing stages, our production-ready deep learning pipeline will run automatically when we put data into "training" and "input_images." It just works.</p>
<p>Chris provides a nice guide for preparing training sets <a href="https://github.com/affinelayer/pix2pix-tensorflow#datasets-and-trained-models">here</a>. You can use cat images, dog images, buildings, or anything that might interest you. Be creative and show us what you come up with! When you have your training and input images ready, you can get them into Pachyderm using the <code>pachctl</code> CLI tool or one of the Pachyderm clients (discussed in more detail <a href="http://docs.pachyderm.io/en/stable/deployment/inputing_your_data.html">here</a>).</p>
<p>For some inspiration, we ran Pachyderm’s pipeline with Google Maps images paired with satellite images to create a model that translates Google Maps screenshots into pictures resembling satellite images. Once we had our model trained, we could stream Google Maps screenshots through the pipeline to create translations like this:</p>
<p><a href="https://ycombinator.wpengine.com/wp-content/uploads/2017/04/edge2anything-6.png">[Google Maps to satellite image translation example]</a></p>
<p>To go further:</p>
<ul>
<li>Visit the <a href="https://github.com/pachyderm/pachyderm/tree/master/doc/examples/ml/tensorflow/">GitHub repo</a> to get the above reference pipeline specs along with even more detailed instructions.</li>
<li>Join the <a href="http://slack.pachyderm.io/">Pachyderm Slack team</a> to get help implementing your pipeline.</li>
<li>Visit Chris’s <a href="https://github.com/affinelayer/pix2pix-tensorflow/">GitHub repo</a> to learn more about the model implementation.</li>
</ul>
W15).","website":null,"location":null,"facebook":null,"twitter":null,"meta_title":null,"meta_description":null,"url":"https://ghost.prod.ycinside.com/author/dan-whitenack/"},"primary_tag":{"id":"61fe29efc7139e0001a7116d","name":"Essay","slug":"essay","description":null,"feature_image":null,"visibility":"public","og_image":null,"og_title":null,"og_description":null,"twitter_image":null,"twitter_title":null,"twitter_description":null,"meta_title":null,"meta_description":null,"codeinjection_head":null,"codeinjection_foot":null,"canonical_url":null,"accent_color":null,"url":"https://ghost.prod.ycinside.com/tag/essay/"},"url":"https://ghost.prod.ycinside.com/from-edge2cat-to-edge2anything-with-tensorflow/","excerpt":"Unless you have been hiding under a rock for the past few months, you have likely seen Christopher Hesse’s demo of image-to-image translation (a Tensorflow port of pix2pix by Isola et al.). In case you missed it, search for edge2cat, and a whole new world of cat-infused artificial intelligence will be opened to you. The model is trained on cat images, and it can translate hand drawn cats to realistic images of cats! Here are a few of our personal favorite “edge” image to cat translations generated by Chris’s model, ranging from accurate to horrifying:","reading_time":5,"access":true,"og_image":null,"og_title":null,"og_description":null,"twitter_image":null,"twitter_title":null,"twitter_description":null,"meta_title":null,"meta_description":null,"email_subject":null,"frontmatter":null,"feature_image_alt":null,"feature_image_caption":null}],"filter":"(Technical)","featured":null,"pagination":{"page":1,"limit":10,"pages":1,"total":4,"next":null,"prev":null}},"url":"/blog/tag/technical","version":"aa5cc48c512ec693ec60765a0397dfe59cf5da82","encryptHistory":false,"clearHistory":false,"rails_context":{"railsEnv":"production","inMailer":false,"i18nLocale":"en","i18nDefaultLocale":"en","href":"https://www.ycombinator.com/blog/tag/technical","location":"/blog/tag/technical","scheme":"https","host":"www.ycombinator.com","port":null,"pathname":"/blog/tag/technical","search":null,"httpAcceptLanguage":"en, *","applyBatchLong":"Summer 2025","applyBatchShort":"S2025","applyDeadlineShort":"May 13","ycdcRetroMode":true,"currentUser":null,"serverSide":true},"id":"ycdc_new/pages/BlogList-react-component-03b983e1-5bf8-4cb2-8be3-2f57cbda5c77","server_side":true}" data-reactroot="">