If you've ever wired a workflow into HubSpot, you know the pain: OAuth flows, token refresh, scopes, and an SDK you have to keep current with every HubSpot API change. This post walks through a different approach — one that combines a direct REST API for rich enrichment data, the Zapier SDK for the long tail of CRM writes, and an LLM as the semantic glue between them.
The workflow takes a company website, enriches it with Apollo, maps the returned industry string to a valid HubSpot enum via Claude Haiku, and upserts the record into HubSpot. Four steps, two of which run in parallel, and zero HubSpot auth code.
Why this shape?
HubSpot's industry field is a predefined dropdown, not free text. Apollo returns industries as freeform strings like "Internet Software & Services". Hardcoding a mapping table is brittle — HubSpot adds and removes choices, and Apollo's taxonomy is huge.
So the workflow fetches the current HubSpot enum list at runtime via the Zapier SDK and asks Claude Haiku to pick the closest semantic match. Swap HubSpot for Salesforce, Pipedrive, or any other Zapier-supported CRM by changing an app key — the mapping pattern still works.
The four steps
- Enrich with Apollo: Extract the domain from the input website and call the Apollo organization API for industry, employee count, funding stage, LinkedIn, location, and keywords.
- Fetch HubSpot industries: Use the Zapier SDK's listInputFieldChoices helper to pull the live set of valid HubSpot industry enum values.
- Map industry with Claude: Send Apollo's raw industry string and the full HubSpot enum list to claude-haiku-4-5 for the single best semantic match. Skipped when Apollo returns no industry.
- Upsert to HubSpot: Hand the enriched payload (with the mapped HubSpot industry) to the Zapier SDK's search_or_write action.
Steps 1 and 2 run in parallel — they're independent calls, so there's no reason to serialize them.
File structure
zapier_hubspot_company_enrichment/
├── workflow.ts                            # Orchestration: parallel fetch, then map, then upsert
├── steps.ts                               # 4 steps: enrich, fetchIndustries, mapIndustry, upsert
├── types.ts                               # Zod schemas and TypeScript types
├── prompts/
│   └── map_hubspot_industry@v1.prompt     # Haiku: semantic industry mapping
└── scenarios/
    └── stripe.json                        # Test input: Stripe
Two shared clients do the I/O:
- apollo.ts: enrichOrganization enriches a company by domain via the Apollo API
- zapier.ts: createZapierClient instantiates the Zapier SDK with credentials loaded from @outputai/credentials
workflow.ts
The Apollo enrichment and HubSpot industry fetch run in parallel with Promise.all. The LLM mapping step is conditional: if Apollo didn't return an industry, there's nothing to map, and the workflow passes undefined straight to the upsert. Every call is wrapped in step() so Output can trace, retry, and cache each one.
import { workflow } from '@outputai/core';
import {
  enrichCompanyWithApollo,
  fetchHubspotIndustries,
  mapHubspotIndustry,
  upsertHubspotCompany,
} from './steps.js';
import { workflowInputSchema, workflowOutputSchema } from './types.js';

export default workflow({
  name: 'zapier_company_enrichment',
  description:
    'Enriches a company profile using Apollo via REST API and upserts the result into HubSpot via Zapier SDK',
  inputSchema: workflowInputSchema,
  outputSchema: workflowOutputSchema,
  fn: async (input) => {
    // Steps 1 + 2 -- Enrich via Apollo and fetch HubSpot industry choices in parallel
    const [apolloData, { industries }] = await Promise.all([
      enrichCompanyWithApollo({ website: input.website }),
      fetchHubspotIndustries(),
    ]);

    // Step 3 -- Map Apollo's raw industry string to a valid HubSpot enum (if any)
    const hubspotIndustry = apolloData.industry
      ? (
          await mapHubspotIndustry({
            industry: apolloData.industry,
            hubspotIndustries: industries,
          })
        ).hubspotIndustry
      : undefined;

    // Step 4 -- Upsert the enriched + mapped company into HubSpot via Zapier
    const { hubspotCompanyId, action } = await upsertHubspotCompany({
      ...apolloData,
      hubspotIndustry,
    });

    return {
      companyName: apolloData.name,
      website: input.website,
      hubspotCompanyId,
      apolloData,
      action,
    };
  },
});
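The control flow above (parallel fetch, then a conditional mapping call) can be exercised without any credentials by injecting stand-in functions. The sketch below is only an illustration: enrichAndMap, Deps, and Enriched are hypothetical names that do not exist in the example repository, and the real workflow calls Output steps rather than plain functions.

```typescript
// Hypothetical stand-ins for the real steps; names and shapes are illustrative only.
type Enriched = { name: string; industry?: string };
type Deps = {
  enrich: (website: string) => Promise<Enriched>;
  fetchIndustries: () => Promise<string[]>;
  mapIndustry: (industry: string, choices: string[]) => Promise<string>;
};

// Same shape as the workflow: two independent calls in parallel, then an optional mapping.
async function enrichAndMap(website: string, deps: Deps) {
  const [company, choices] = await Promise.all([
    deps.enrich(website),
    deps.fetchIndustries(),
  ]);
  const hubspotIndustry = company.industry
    ? await deps.mapIndustry(company.industry, choices)
    : undefined; // enrichment returned no industry, so the mapping call is skipped
  return { company, hubspotIndustry };
}
```

Passing the dependencies in as arguments makes it easy to check, with fakes, that the mapping function is never called when the enrichment result has no industry.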
steps.ts
The Apollo step extracts the domain from the input URL before calling the client — Apollo lookups are keyed by domain, not full URL. fetchHubspotIndustries paginates through the live enum values for the industry dropdown on HubSpot's company object. mapHubspotIndustry delegates to a prompt file rather than inlining the system/user text. The upsert step uses search_or_write with name as the search key, so Zapier either updates an existing HubSpot company by name or creates a new one. The connectionId identifies which user's HubSpot account the write goes to.
import { step } from '@outputai/core';
import { generateText, Output } from '@outputai/llm';
import { enrichOrganization } from '../../shared/clients/apollo.js';
import { createZapierClient } from '../../shared/clients/zapier.js';
import {
  enrichCompanyInputSchema,
  apolloCompanySchema,
  fetchHubspotIndustriesOutputSchema,
  mapHubspotIndustryInputSchema,
  mapHubspotIndustryOutputSchema,
  hubspotUpsertInputSchema,
  hubspotUpsertOutputSchema,
  zapierHubspotResponseSchema,
} from './types.js';

const HUBSPOT_CONNECTION_ID = 'your-hubspot-connection-id';

function extractDomain(website: string): string {
  const url = new URL(website);
  return url.hostname.replace(/^www\./, '');
}
// --- Step 1: Enrich company via Apollo REST API ---
export const enrichCompanyWithApollo = step({
  name: 'enrich_company_with_apollo',
  description: 'Enriches company data using Apollo REST API directly',
  inputSchema: enrichCompanyInputSchema,
  outputSchema: apolloCompanySchema,
  fn: async ({ website }) => {
    const domain = extractDomain(website);
    const org = await enrichOrganization(domain);
    if (!org?.name) {
      throw new Error(`Apollo returned no data for domain: ${domain}`);
    }
    return {
      name: org.name,
      website: org.website_url ?? website,
      domain: org.primary_domain ?? domain,
      industry: org.industry ?? undefined,
      employeeCount: org.estimated_num_employees ?? undefined,
      estimatedRevenue: org.annual_revenue_printed ?? undefined,
      description: org.short_description ?? undefined,
      linkedinUrl: org.linkedin_url ?? undefined,
      city: org.city ?? undefined,
      country: org.country ?? undefined,
      keywords: Array.isArray(org.keywords) ? org.keywords : undefined,
      totalFunding: org.total_funding ?? undefined,
      latestFundingRound: org.latest_funding_round_date ?? undefined,
      fundingStage: org.latest_funding_stage ?? undefined,
    };
  },
});
// --- Step 2: Fetch HubSpot industry enum choices via the Zapier SDK ---
export const fetchHubspotIndustries = step({
  name: 'fetch_hubspot_industries',
  description: 'Fetches available HubSpot industry field choices via Zapier SDK',
  outputSchema: fetchHubspotIndustriesOutputSchema,
  fn: async () => {
    const zapier = createZapierClient();
    const industries: string[] = [];
    for await (const item of zapier
      .listInputFieldChoices({
        appKey: 'hubspot',
        actionType: 'search_or_write',
        actionKey: 'company_crmSearch',
        inputFieldKey: 'industry',
        connectionId: HUBSPOT_CONNECTION_ID,
      })
      .items()) {
      const value = item.value ?? item.key ?? item.label;
      if (value) industries.push(value);
    }
    return { industries };
  },
});
// --- Step 3: Map Apollo's industry string to a HubSpot enum via LLM ---
export const mapHubspotIndustry = step({
  name: 'map_hubspot_industry',
  description: 'Maps a raw industry string to a valid HubSpot industry enum value using an LLM',
  inputSchema: mapHubspotIndustryInputSchema,
  outputSchema: mapHubspotIndustryOutputSchema,
  fn: async ({ industry, hubspotIndustries }) => {
    const { output } = await generateText({
      prompt: 'map_hubspot_industry@v1',
      variables: {
        industry,
        hubspotIndustries: hubspotIndustries.join(', '),
      },
      output: Output.object({ schema: mapHubspotIndustryOutputSchema }),
    });
    return output;
  },
});
// --- Step 4: Upsert into HubSpot via the Zapier SDK ---
export const upsertHubspotCompany = step({
  name: 'upsert_hubspot_company',
  description:
    'Creates or updates a HubSpot company record using enriched Apollo data via Zapier SDK',
  inputSchema: hubspotUpsertInputSchema,
  outputSchema: hubspotUpsertOutputSchema,
  fn: async (input) => {
    const zapier = createZapierClient();
    // Only parse the website when one is present: new URL('') would throw.
    const domain = input.domain ?? (input.website ? extractDomain(input.website) : '');
    const inputs = {
      first_search_property_name: 'name',
      first_search_property_value: input.name,
      name: input.name,
      domain,
      website: input.website ?? '',
      city: input.city ?? '',
      country: input.country ?? '',
      industry: input.hubspotIndustry ?? '',
      numberofemployees: input.employeeCount ? String(input.employeeCount) : '',
      description: input.description ?? '',
      linkedin_company_page: input.linkedinUrl ?? '',
      total_money_raised: input.totalFunding ? String(input.totalFunding) : '',
    };
    const { data: result } = await zapier.apps.hubspot.search_or_write.company_crmSearch({
      inputs,
      connectionId: HUBSPOT_CONNECTION_ID,
    });
    const [record] = zapierHubspotResponseSchema.parse(result);
    return {
      hubspotCompanyId: record.id,
      action: record.isNew ? 'created' : 'updated',
    };
  },
});
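extractDomain relies on new URL, which throws on empty or scheme-less input such as "acme.com". If the workflow might ever receive such values, a more forgiving variant could look like the sketch below. safeExtractDomain is a hypothetical helper, not part of the original steps.ts, and treating scheme-less input as https is an assumption.

```typescript
// Hypothetical helper: like extractDomain, but returns undefined instead of
// throwing when the input is missing or cannot be parsed as a URL.
function safeExtractDomain(website: string | undefined): string | undefined {
  if (!website) return undefined;
  // Assumption: input without a scheme (e.g. "acme.com") should be read as https.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(website);
  const candidate = hasScheme ? website : `https://${website}`;
  try {
    const { hostname } = new URL(candidate);
    return hostname.replace(/^www\./, '') || undefined;
  } catch {
    return undefined; // not parseable as a URL
  }
}
```

Returning undefined rather than throwing lets the caller decide whether a missing domain should fail the step or simply be written as an empty field.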
types.ts
The Apollo response has many optional fields, so the schema uses .optional() generously. hubspotUpsertInputSchema extends the Apollo schema with a single hubspotIndustry field — the LLM-mapped value. The workflow output includes an action discriminator (created or updated) so callers know whether the upsert inserted a new record — useful for downstream triggers like "notify sales when a new company lands."
import { z } from '@outputai/core';

export const workflowInputSchema = z.object({
  companyName: z.string().describe('The name of the company to enrich'),
  website: z.string().url().describe('The company website URL (e.g. https://acme.com)'),
});

export const apolloCompanySchema = z.object({
  name: z.string(),
  website: z.string().optional(),
  domain: z.string().optional(),
  industry: z.string().optional(),
  employeeCount: z.number().optional(),
  estimatedRevenue: z.string().optional(),
  description: z.string().optional(),
  linkedinUrl: z.string().optional(),
  city: z.string().optional(),
  country: z.string().optional(),
  keywords: z.array(z.string()).optional(),
  totalFunding: z.number().optional().describe('Total funding raised in USD'),
  latestFundingRound: z.string().optional(),
  fundingStage: z.string().optional(),
});

export const workflowOutputSchema = z.object({
  companyName: z.string(),
  website: z.string(),
  hubspotCompanyId: z.string(),
  apolloData: apolloCompanySchema,
  action: z.enum(['created', 'updated']),
});

export const hubspotUpsertInputSchema = apolloCompanySchema.extend({
  hubspotIndustry: z.string().optional(),
});
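The excerpt above does not show zapierHubspotResponseSchema, which steps.ts uses to validate the Zapier response before reading record.id and record.isNew. The dependency-free sketch below illustrates the kind of check such a schema would need to perform. The response shape (a non-empty array of objects with id and isNew) is inferred from how the upsert step reads it, and parseZapierHubspotResponse is a hypothetical name. The real schema is a Zod schema in the repository and may differ.

```typescript
// Assumed record shape, inferred from the fields the upsert step reads.
type ZapierHubspotRecord = { id: string; isNew: boolean };

// Hypothetical validator: throws on anything that is not a non-empty array
// of objects carrying a usable id.
function parseZapierHubspotResponse(result: unknown): ZapierHubspotRecord[] {
  if (!Array.isArray(result) || result.length === 0) {
    throw new Error('Expected a non-empty array of records');
  }
  return result.map((item: unknown, index: number) => {
    if (typeof item !== 'object' || item === null) {
      throw new Error(`Record ${index} is not an object`);
    }
    const { id, isNew } = item as { id?: unknown; isNew?: unknown };
    if (typeof id !== 'string' && typeof id !== 'number') {
      throw new Error(`Record ${index} has no usable id`);
    }
    // Assumption: ids may arrive as numbers, so they are normalized to strings.
    return { id: String(id), isNew: isNew === true };
  });
}
```

Failing loudly on an empty array matters here because the upsert step destructures the first element and would otherwise read properties of undefined.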
The prompt
claude-haiku-4-5 is plenty for a constrained-vocabulary classification task. Setting temperature: 0 makes the mapping as repeatable as possible for the same input. The full HubSpot enum list is interpolated into the system message at runtime, so the model sees the exact vocabulary it must pick from, which sharply reduces (though it cannot fully eliminate) the chance of it returning an industry value that HubSpot will reject.
---
provider: anthropic
model: claude-haiku-4-5
temperature: 0
maxTokens: 256
---
<system>
You are an expert at mapping company industry strings to HubSpot's predefined industry ENUM values.
Given an industry category, return the single best-matching HubSpot industry value.
Valid HubSpot industry values:
{{ hubspotIndustries }}
Rules:
- Return EXACTLY one value from the list above
- Pick the closest semantic match even if it's not an exact string match
- If no reasonable match exists, return the closest category
</system>
<user>
Map this industry to a HubSpot industry value:
Industry: {{ industry }}
</user>
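Even with the full list in the prompt, a model can occasionally return a value that is not in it, so a membership check before the upsert is a cheap safeguard. The sketch below is one way to do that. resolveEnumValue is a hypothetical helper, not something the example repository defines, and matching case-insensitively after trimming is an assumption about how forgiving the check should be.

```typescript
// Hypothetical guard: returns the canonical value from the allowed list when the
// candidate matches it (ignoring case and surrounding whitespace), else undefined.
function resolveEnumValue(
  candidate: string,
  allowed: readonly string[],
): string | undefined {
  const normalize = (s: string) => s.trim().toUpperCase();
  const wanted = normalize(candidate);
  return allowed.find((value) => normalize(value) === wanted);
}
```

In the workflow, the mapped value could be passed through this guard along with the industries list from step 2, with undefined treated the same way as a missing industry.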
About the Zapier SDK
The Zapier SDK is a TypeScript library that provides programmatic access to Zapier's 9,000+ app integrations. Instead of managing OAuth flows, token refresh, and per-app API quirks yourself, the SDK runs actions through the user's existing Zapier connections.
| Concept | Description |
|---|---|
| App Key | Identifier for an integrated app (hubspot, slack, google_calendar, ...) |
| Connection | A user-authenticated account linked to a specific app, identified by connection ID |
| Action | search (find), write (create/update), read (list), or search_or_write (upsert) |
Authentication uses a client ID and secret pair:
import { createZapierSdk } from '@zapier/zapier-sdk';

const zapier = createZapierSdk({
  credentials: { clientId: '...', clientSecret: '...' },
});
Actions are invoked through a chained zapier.apps.<appKey>.<actionType>.<actionKey>() pattern:
const { data: result } = await zapier.apps.hubspot.search_or_write.company_crmSearch({
  inputs: {
    first_search_property_name: 'name',
    first_search_property_value: 'Stripe',
    name: 'Stripe',
    domain: 'stripe.com',
    industry: 'COMPUTER_SOFTWARE',
  },
  connectionId: HUBSPOT_CONNECTION_ID,
});
Beyond running actions, the SDK exposes metadata helpers. listInputFieldChoices paginates through the live set of accepted values for a dropdown/enum input — so the workflow always sees the current vocabulary instead of a stale hardcoded list.
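The collection loop from step 2 can be pulled into a small function that works on any async iterable, which makes it testable without a Zapier connection. This is only a sketch: collectChoices and fakeChoices are hypothetical names, the Choice shape is assumed from the value, key, and label fields that the step reads, and de-duplicating the values is an added design choice rather than something the original step does.

```typescript
// Item shape assumed from the fields the fetch step reads.
type Choice = { value?: string; key?: string; label?: string };

// Drains a paginated async iterable and returns the unique, non-empty values
// in first-seen order, preferring value, then key, then label.
async function collectChoices(items: AsyncIterable<Choice>): Promise<string[]> {
  const seen = new Set<string>();
  for await (const item of items) {
    const value = item.value ?? item.key ?? item.label;
    if (value) seen.add(value);
  }
  return Array.from(seen);
}

// Stand-in for the SDK's paginated iterator so the sketch runs without credentials.
async function* fakeChoices(): AsyncGenerator<Choice> {
  yield { value: 'COMPUTER_SOFTWARE', label: 'Computer Software' };
  yield { key: 'BANKING' };
  yield { value: 'COMPUTER_SOFTWARE' }; // duplicate, dropped
  yield {}; // no usable field, dropped
}
```

In the real step, the argument would be the iterable returned by the SDK's items() call in place of fakeChoices().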
The pattern worth stealing
Any time you're writing to a downstream system with dropdown fields (lifecycle stage, lead source, deal stage, ticket priority), listInputFieldChoices + a cheap model makes your integration survive upstream vocabulary changes without a code release.
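A field-agnostic version of this idea might look like the sketch below, where the model call is injected as a function so the same helper can serve lifecycle stage, lead source, or any other dropdown. mapToChoice and PickChoice are hypothetical names. The exact-match fast path, which skips the model when the raw value is already a valid choice, is an optional optimization and not something the original workflow does.

```typescript
// Hypothetical signature for whatever asks the model to choose; inject a real
// LLM call in practice.
type PickChoice = (raw: string, choices: readonly string[]) => Promise<string>;

async function mapToChoice(
  raw: string | undefined,
  choices: readonly string[],
  pick: PickChoice,
): Promise<string | undefined> {
  const trimmed = raw?.trim();
  if (!trimmed || choices.length === 0) return undefined; // nothing to map
  // Fast path: the raw value already matches a choice, so no model call is needed.
  const exact = choices.find((c) => c.toLowerCase() === trimmed.toLowerCase());
  if (exact) return exact;
  const picked = await pick(trimmed, choices);
  // Accept the model's answer only if it is in the live list.
  return choices.includes(picked) ? picked : undefined;
}
```

Because the picker is a parameter, tests can pass a deterministic fake and production code can pass a function that wraps the prompt call.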
The broader pattern — direct API for rich enrichment data, Zapier SDK for the long tail of CRM writes, LLM for the semantic glue between them — scales to any integration where you need deep data from one source and broad reach into another.
Swap HubSpot for Salesforce, Pipedrive, or any of Zapier's 9,000+ integrated apps by changing a single app key. Or keep HubSpot and add a Slack step that notifies sales when action === 'created', so reps learn about new accounts the moment they land.
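The notification idea could start with a small pure function that decides whether a message is warranted and what it says. In the sketch below, newCompanyMessage and EnrichmentResult are hypothetical names, the field names follow workflowOutputSchema, and the message wording is invented for illustration. Sending the text to Slack is deliberately left out, since the relevant Zapier action key is not covered in this post.

```typescript
// Subset of the workflow output used here; field names follow workflowOutputSchema.
type EnrichmentResult = {
  companyName: string;
  website: string;
  hubspotCompanyId: string;
  action: 'created' | 'updated';
};

// Returns message text for newly created companies, or undefined for updates.
function newCompanyMessage(result: EnrichmentResult): string | undefined {
  if (result.action !== 'created') return undefined;
  return `New company in HubSpot: ${result.companyName} (${result.website}), id ${result.hubspotCompanyId}`;
}
```

Keeping the decision separate from the delivery makes it straightforward to swap Slack for email or another channel later.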
The complete code is available at https://github.com/growthxai/output-examples/tree/main/src/workflows/zapier_hubspot_company_enrichment
If you found this tutorial useful, check out Output: https://github.com/growthxai/output

