
Deploy & renew certificates on Linux workstations

Open ddribeiro opened this issue 1 year ago • 16 comments

  • customer-interkosmos: Gong snippet: TODO
    • customer-interkosmos promise. Order form here.
  • customer-cisneros: Gong call: https://us-65885.app.gong.io/call?id=6846455535794787581
    • @noahtalerman: Calling this one a cisneros promise. Not on the order form.
    • Slack discussion from 2025-06-20.
    • This call was cut short by network/Zoom issues. The 2nd part of the call took place on WebEx and was converted to Gong below: https://us-65885.app.gong.io/call?id=3557819567386048079
    • Uploads of K's 4 part flow breakdown:
      • Part 1: https://us-65885.app.gong.io/call?id=3309772733099149092
      • Part 2: https://us-65885.app.gong.io/call?id=4029241620990511498
      • Part 3: https://us-65885.app.gong.io/call?id=3016742035690078737
      • Part 4: https://us-65885.app.gong.io/call?id=8284649856449574702
    • Flowchart provided by Cisneros of the current workflow: https://drive.google.com/file/d/1e_0YNXrIHCqgf92eX_iCAbw0O1GtbOq9/view?usp=share_link
    • More details in the now closed (duplicate) issue here: #27412
  • customer-pingouin: Gong snippet: https://us-65885.app.gong.io/call?id=3507336640763012987&highlights=%5B%7B%22type%22%3A%22SHARE%22%2C%22from%22%3A406%2C%22to%22%3A618%7D%5D
  • @noahtalerman: User requested this because they require that the end user's workstation has a certificate in order to access the corporate network.
    • @noahtalerman: In the interim pingouin uses certmonger to check certificate renewal times and trigger certificate renewal. They don't want to have to maintain the infra to manage this tool.
    • @noahtalerman: Eventually Fleet could deploy and renew certificates on Linux.
  • @noahtalerman: High level end user experience: Fleet runs a job every 5 minutes to check for expired certs. If one is expired, end user sees a prompt to enter their IdP password. End user enters their password and sees success notification when new certificate is delivered.
    • @noahtalerman: No end user interaction necessary if Fleet is delivering the certificates.
  • @noahtalerman: How this works under the hood today for customer-cisneros: code hosted on fleetdm.com calls Fleet every 5 minutes to run a script that checks certificate expiration. If the certificate is expired, Fleet runs another script that prompts the end user to enter their password (zenity), and the input is used to reach out to the IdP. The IdP talks to the PKI, the PKI sends back a certificate, and the certificate is installed on the host.
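
A minimal sketch of one pass of that periodic job, assuming the four steps can be modeled as plain callables. `RenewalSteps`, `run_check`, and the field names are hypothetical illustrations, not Fleet APIs or the customer's actual scripts:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class RenewalSteps:
    # All four callables are hypothetical stand-ins, not Fleet APIs.
    get_expiry: Callable[[], datetime]            # e.g. a script that reads the cert's end date
    prompt_password: Callable[[], Optional[str]]  # e.g. a zenity dialog; None if dismissed
    request_cert: Callable[[str], bytes]          # IdP + PKI round trip
    install_cert: Callable[[bytes], None]         # writes the certificate to the host


def run_check(steps: RenewalSteps, now: datetime) -> str:
    """Run one pass of the periodic job and return a short status string."""
    if steps.get_expiry() > now:
        return "valid"  # nothing to do until the next run
    password = steps.prompt_password()
    if not password:
        return "prompt_dismissed"  # retry on the next scheduled run
    steps.install_cert(steps.request_cert(password))
    return "renewed"
```

Keeping each step behind a callable makes it possible to swap the password prompt for a server-side issuance path later, which matches the note above that no end user interaction is needed if Fleet delivers the certificates.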

User stories

  • Part 0: fleetdm.com endpoint for demo of Fleet replacing Lambda and eliminating 15 minute wait for end users

    • When? Ready now
  • Part 1: Native Fleet endpoint for replacing Lambda in production:

    • #28974
    • When? June 2025
  • Part 2: Native Fleet certificate UI and IdP integration to replace custom code

    • When? July 2025
  • #29577

ddribeiro avatar Aug 06 '24 18:08 ddribeiro

Thanks for tracking this @ddribeiro.

I think the plan is to use fleetdm.com (as a server that checks for expired certs) and script execution features in Fleet for now.

At some point in the future, when we have additional engineering capacity, we'll add this to Fleet.

cc @zwass @zayhanlon

noahtalerman avatar Aug 13 '24 00:08 noahtalerman

I think the plan is to use fleetdm.com (as a server that checks for expired certs) and script execution features in Fleet for now.

Confirmed w/ @zayhanlon. Removing from feature fest for now.

cc @ddribeiro

noahtalerman avatar Aug 13 '24 19:08 noahtalerman

@pboushy - FYI

phtardif1 avatar Sep 25 '24 19:09 phtardif1

@noahtalerman Here's one way to do this with certmonger and script execution. https://docs.portnox.com/topics/onboarding_uem_linux_scep. The solution involves local private key generation on the linux endpoint, which might be a blocker for some organizations that would prefer the Fleet server to do the brokerage of the CSR generation and renewal / revocation workflow.

dherder avatar Sep 26 '24 16:09 dherder

see #22452 @noahtalerman

dherder avatar Sep 27 '24 00:09 dherder

Here's one way to do this with certmonger and script execution. https://docs.portnox.com/topics/onboarding_uem_linux_scep. The solution involves local private key generation on the linux endpoint, which might be a blocker for some organizations that would prefer the Fleet server to do the brokerage of the CSR generation and renewal / revocation workflow.

Thanks @dherder :)

FYI @zwass and @zayhanlon re certificates on Linux

noahtalerman avatar Sep 27 '24 18:09 noahtalerman

@noahtalerman I think the major challenge here is to ensure that the fleetd has an understanding of the certificate revocation process and renewal. I can't see how to do this with just script execution. The fleetd has to have some logic to be able to track issuance, revocation, and renewals.

dherder avatar Oct 02 '24 18:10 dherder

major challenge here is to ensure that the fleetd has an understanding of the certificate revocation process and renewal. I can't see how to do this with just script execution.

@dherder I think customer-cisneros's plan to address this is to have another server/third-party tool handle tracking this. That tool hits the Fleet API to run scripts.

I could be wrong.

cc @zwass

noahtalerman avatar Oct 02 '24 21:10 noahtalerman

Thanks for the mention @phtardif1.

This is an area I've spent a lot of time researching on both Mac and Linux, and there doesn't seem to be a good solution that works across multiple Linux distros.

For me, the most critical items are:

  1. Only devices that are enrolled in Fleet can request and retrieve a certificate. (this appears to be the most complex piece to script a solution because all the standards use static passphrases)
  2. No static passphrase used to request the certificate. Use some form of authentication that is unique per client device.
  3. The option to generate the private key and CSR locally on the device.
  4. Ideally the private key should be locked to the device that requested it. In Windows or Mac, you do this by storing it in TPM or SecureEnclave respectively.
  5. Monitor the expiration date of the certificate and kick off a renewal process at x% of the lifespan (e.g. renewing at 10% of the lifespan means 36 of 365 days, or 10 of 100 days, so the threshold doesn't need adjusting as cert lifespans shrink). You can read the expiration date with a command like this: openssl pkcs12 -in "${cert}" -passin "${passin}" -nokeys -clcerts | openssl x509 -noout -enddate | cut -d = -f 2
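
A rough sketch of the percentage-based check in item 5, assuming "10% of the lifespan" means renewing once 10% or less of the validity period remains. `parse_enddate` and `should_renew` are hypothetical helper names. The date format is what `openssl x509 -noout -enddate` prints after the `cut`, and month-name parsing assumes an English/C locale:

```python
from datetime import datetime, timezone


def parse_enddate(value: str) -> datetime:
    """Parse the end date printed by openssl, e.g. 'Jun  1 12:00:00 2026 GMT'."""
    normalized = " ".join(value.split())  # collapse padding before single-digit days
    parsed = datetime.strptime(normalized, "%b %d %H:%M:%S %Y GMT")
    return parsed.replace(tzinfo=timezone.utc)


def should_renew(not_before: datetime, not_after: datetime,
                 now: datetime, fraction: float = 0.10) -> bool:
    """True once the remaining validity is at most `fraction` of the total lifespan."""
    lifespan = not_after - not_before
    return (not_after - now) <= lifespan * fraction
```

Because the threshold is a fraction rather than a fixed number of days, the same check keeps working if certificate lifespans are shortened.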

For actually getting the certificate from a PKI environment, there are several options:

  • Fleet proxies the request for a certificate and either:
    1. uses a certificate request standard to retrieve the cert (e.g. SCEP, ACME), or
    2. integrates with multiple PKI services (Symantec, Sectigo, DigiCert, AD CS, Venafi)
  • Configure the client to use SCEP or ACME - the biggest problem with this is that it usually fails critical item 1.

For certificate revocation, I believe that both SCEP and ACME support revocation through the APIs. If you instead were to integrate with each PKI service's individual configuration, you would need to integrate revocation for each one individually.

pboushy avatar Oct 03 '24 01:10 pboushy

@noahtalerman I think the major challenge here is to ensure that the fleetd has an understanding of the certificate revocation process and renewal. I can't see how to do this with just script execution. The fleetd has to have some logic to be able to track issuance, revocation, and renewals.

cc @dherder I could be very wrong but it kind of seems like we already have the mechanism for watching attributes of any cert on any device, ie, the certificates osquery table which is cross-platform...

The new part would be that in the Fleet db there could be a table that stored expiry for "Fleet-issued" certs & when the expiry reached some safe period prior to expiration like 14d, a command could be issued to renew.
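
To make that idea concrete, here is a small sketch using an in-memory SQLite table. The `issued_certs` table, its columns, and `certs_due_for_renewal` are hypothetical and are not Fleet's actual schema. The 14-day window is the example figure from the paragraph above:

```python
import sqlite3

RENEWAL_WINDOW_SECONDS = 14 * 24 * 60 * 60  # 14-day safety window before expiry


def certs_due_for_renewal(conn, now_epoch):
    """Return (host_uuid, serial) pairs whose expiry falls inside the renewal window."""
    rows = conn.execute(
        "SELECT host_uuid, serial FROM issued_certs "
        "WHERE not_valid_after <= ? ORDER BY not_valid_after",
        (now_epoch + RENEWAL_WINDOW_SECONDS,),
    )
    return rows.fetchall()


# Hypothetical table of Fleet-issued certs; expiry stored as a Unix timestamp.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issued_certs (host_uuid TEXT, serial TEXT, not_valid_after INTEGER)"
)
conn.executemany(
    "INSERT INTO issued_certs VALUES (?, ?, ?)",
    [("host-a", "01", 1_000_000), ("host-b", "02", 5_000_000)],
)
```

Each row returned would be a candidate for a renewal command. How that command is delivered to the host is left open here.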

nonpunctual avatar Oct 10 '24 19:10 nonpunctual

Moving the original issue description here for safekeeping:

Problem

As an IT admin, I'd like Fleet to orchestrate the lifecycle management of client certificates on my Linux hosts.

There are several parts to this request that might need to be broken down into smaller stories:

  1. Create a system to generate and issue client certificates
  2. Automatically renew client certificates when they are about to expire
  3. Have the ability to revoke client certificates through Fleet
  4. Prevent a device that has had its certificate revoked from generating a new one if it gets re-imaged (persist device record in Fleet based on UUID or other identifier)
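
As an illustration of item 4, a minimal sketch of a revocation record keyed by a persistent device identifier. `IssuancePolicy` is a hypothetical name, and a real implementation would persist the record server-side rather than in memory:

```python
class IssuancePolicy:
    """Tracks revoked device identifiers so a re-imaged host cannot request a new cert."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, hardware_uuid: str) -> None:
        # Normalize case so the same device is matched after a re-image.
        self._revoked.add(hardware_uuid.lower())

    def can_issue(self, hardware_uuid: str) -> bool:
        return hardware_uuid.lower() not in self._revoked
```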

What have you tried?

customer-cisneros is using scripts and Ubuntu Landscape to achieve this today.

  1. A script is used to generate a private key on device
  2. A script is used to generate a CSR on device
  3. A script is used to pull the CSR from the device and submit it to a PKI service
  4. Certificate is generated on the server and deployed to client device using Landscape

This workflow is able to renew certificates before they expire. It does not handle revocation.

Potential solutions

The solution for the customer would be to build a system in Fleet that replaces their current workflow and meets the requirements in the above sections. I don't have any solutions on how to best achieve this.

What is the expected workflow as a result of your proposal?

The expected workflow is that a Fleet admin would be able to use Fleet to manage client certificates on their Linux hosts instead of needing to build a custom workflow to handle this.

noahtalerman avatar Jan 20 '25 19:01 noahtalerman

  • customer-cisneros: Gong call: https://us-65885.app.gong.io/call?id=6846455535794787581
    • This call was cut short by network/Zoom issues. The 2nd part of the call took place on WebEx and was converted to Gong below: https://us-65885.app.gong.io/call?id=3557819567386048079
    • Uploads of Kadar's 4 part flow breakdown:
      • Part 1: https://us-65885.app.gong.io/call?id=3309772733099149092
      • Part 2: https://us-65885.app.gong.io/call?id=4029241620990511498
      • Part 3: https://us-65885.app.gong.io/call?id=3016742035690078737
      • Part 4: https://us-65885.app.gong.io/call?id=8284649856449574702
    • Flowchart provided by Cisneros of the current workflow: https://drive.google.com/file/d/1e_0YNXrIHCqgf92eX_iCAbw0O1GtbOq9/view?usp=share_link
    • More details in the now closed (duplicate) issue here: #27412

FYI @ddribeiro I closed this other request as a duplicate and moved the great cisneros findings into this issue's description.

noahtalerman avatar Apr 02 '25 17:04 noahtalerman

Hey @marko-lisica, when you're back, can you please file a user story for this request and bring it on to the drafting board? We prioritized it at the last feature fest.

noahtalerman avatar May 05 '25 14:05 noahtalerman

@zayhanlon can you please help us schedule a call with interkosmos?

noahtalerman avatar May 15 '25 19:05 noahtalerman

@ddribeiro Would you please ask for a higher-res version of their workflow diagram? Unfortunately it's not legible at the current size.

lukeheath avatar May 19 '25 16:05 lukeheath

Moving the old user stories list out of the issue description to below. User stories that contribute to this request now live in the "Sub-issues" section.

User stories

  • Part 0: fleetdm.com endpoint for demo of Fleet replacing Lambda and eliminating 15 minute wait for end users

    • When? Ready now
  • Part 1: Native Fleet endpoint for replacing Lambda in production:

    • #28974
    • When? June 2025
  • Part 2: Native Fleet certificate UI and IdP integration to replace custom code

    • When? July 2025
  • #29577

noahtalerman avatar Jul 15 '25 13:07 noahtalerman

@noahtalerman added snippets for nuptel and montague per your request. The montague snippet provides a very in-depth explanation of why they need this and how they did it on their end.

tl;dr: Linux users typically have root access. In these instances, there is nothing stopping a user from copying the certificate + private key or SCEP secret to another device and potentially gaining access to company resources on a non-company owned device which doesn't have the same protections and monitoring in place.

kc9wwh avatar Jul 30 '25 16:07 kc9wwh

Hey @pintomi1989 we shipped support for configuring Hydrant as a certificate authority and writing a custom script that hits Fleet's API to request/renew EST certificates: https://fleetdm.com/guides/connect-end-user-to-wifi-with-certificate#hydrant

We think this fulfills cisneros's needs. That said, this workflow is complex, so we want to get feedback ASAP to shorten the feedback loop in case Fleet is missing anything. Can you please schedule a call with the customer and me to set up the workflow together?

noahtalerman avatar Oct 13 '25 14:10 noahtalerman

Hey @noahtalerman

Will do. It looks like you are already on the invite list for our meeting this Friday - I can add a bullet point for this discussion then

pintomi1989 avatar Oct 14 '25 12:10 pintomi1989

@kc9wwh can you get a Gong snippet for shackleton? we're currently working on all kinds of variations for this but we want to ensure that we deliver what they're asking for

zayhanlon avatar Nov 03 '25 17:11 zayhanlon

@kc9wwh can you get a Gong snippet for shackleton? we're currently working on all kinds of variations for this but we want to ensure that we deliver what they're asking for

@zayhanlon @noahtalerman - Gong for shackleton attached.

kc9wwh avatar Nov 17 '25 15:11 kc9wwh