
System Components

Core platform architecture, services, nodes, and component interactions. These components determine how jobs are submitted, scheduled, and executed across the CosmicAC infrastructure.


Shared Components

Components used across all job types:

| Component | Description | Role |
| --- | --- | --- |
| app-ui | Web interface | Browser-based dashboard for job management |
| app-node | Application server | Handles HTTP API requests, authenticates users, and routes commands to the orchestrator |
| cosmicac-cli | Command-line interface | Submits jobs, manages resources, and connects to containers from the terminal |
| wrk-ork | Orchestrator | Manages resource allocation, distributes jobs across the cluster, and routes requests to workers |
| wrk-server-k8s-nvidia | Kubernetes GPU server worker | Manages GPU server provisioning and communicates with the Kubernetes cluster |
| k8s-control plane | Kubernetes control plane | Schedules pods, allocates resources, and manages workload lifecycle |

GPU Container Architecture

GPU container jobs run user workloads inside KubeVirt virtual machines with direct GPU access. Each container runs in an isolated VM, scheduled by Kubernetes and managed through the CosmicAC orchestration layer.
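As an illustration, a KubeVirt VirtualMachineInstance with a passed-through GPU looks roughly like the manifest below. The name, resource values, and the `deviceName` are placeholders, not CosmicAC's actual manifest:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: gpu-job-example        # placeholder name
spec:
  domain:
    devices:
      gpus:
        - name: gpu1
          # Resource name exposed by the NVIDIA GPU device plugin;
          # the exact value depends on the cluster's hardware.
          deviceName: nvidia.com/TU104GL_Tesla_T4
    resources:
      requests:
        cpu: "4"
        memory: 8Gi
```

Requesting the GPU through a device plugin (rather than mounting host paths) is what lets the VM get direct GPU access while the pod itself stays non-privileged.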

Components

| Component | Description | Role |
| --- | --- | --- |
| wrk-agent-instance | Instance agent | Runs inside the VM and exposes SSH access over Hyperswarm |
| VMI | KubeVirt VirtualMachineInstance | Virtual machine instance managed by KubeVirt |

Request Flow

  1. User submits a job via app-ui or cosmicac-cli.
  2. app-node authenticates the request over HTTPS (ttr-token).
  3. app-node forwards the request to wrk-ork over HRPC.
  4. wrk-ork routes the job to wrk-server-k8s-nvidia over HRPC.
  5. wrk-server-k8s-nvidia instructs the k8s-control plane to schedule the workload.
  6. Kubernetes creates a pod containing a VMI with wrk-agent-instance.
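The first two steps can be sketched as follows. Everything beyond the component and protocol names — the endpoint path, the header scheme, the payload fields — is an assumption for illustration, not CosmicAC's actual API:

```javascript
// Hypothetical sketch of steps 1-2: shaping a job-submission request
// that app-node would authenticate over HTTPS with a ttr-token.
function buildJobRequest (ttrToken, job) {
  return {
    url: 'https://app-node.example/api/jobs', // placeholder host and path
    method: 'POST',
    headers: {
      // ttr-token carried as a bearer token over HTTPS (assumed scheme)
      authorization: `Bearer ${ttrToken}`,
      'content-type': 'application/json'
    },
    body: JSON.stringify(job)
  }
}

// cosmicac-cli or app-ui would then send this, e.g. fetch(req.url, req)
const req = buildJobRequest('<ttr-token>', { type: 'gpu-container', gpus: 1 })
```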

Shell Access Flow

  1. User runs cosmicac jobs shell from cosmicac-cli.
  2. CLI connects directly to wrk-agent-instance over hyperswarm-ssh.

Managed Inference Architecture

Managed inference jobs run vLLM inside KubeVirt virtual machines and expose model endpoints through a proxy layer. The proxy handles API key verification, load balancing, and service discovery through a distributed hash table.

Components

| Component | Description | Role |
| --- | --- | --- |
| proxy-inference | Inference proxy | Verifies API keys, load balances requests, and routes to inference agents |
| wrk-agent-inference | Inference agent | Runs vLLM inside the VM and handles inference requests over HRPC |
| HyperDB + Autobase | Distributed database | Stores API keys, usage metrics, and job metadata |
| dht table | Distributed hash table | Enables service discovery for inference agents |
| VMI | KubeVirt VirtualMachineInstance | Virtual machine instance managed by KubeVirt |

Request Flow (Job Creation)

  1. User submits a managed inference job via app-ui.
  2. app-node authenticates the request over HTTPS (ttr-token).
  3. app-node forwards the request to wrk-ork over HRPC.
  4. wrk-ork routes the job to wrk-server-k8s-nvidia over HRPC.
  5. Kubernetes creates a pod containing a VMI with wrk-agent-inference.
  6. On spin-up, wrk-agent-inference registers itself to the dht table.

Request Flow (Inference)

  1. User sends an inference request via cosmicac-cli.
  2. The request reaches proxy-inference over HTTPS or HRPC, carrying an api-key.
  3. proxy-inference verifies the API key.
  4. proxy-inference queries the dht table (search by topic) to discover inference agents.
  5. The load balancer routes the request to a wrk-agent-inference over HRPC.
  6. wrk-agent-inference processes the request using vLLM and returns the response.
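The load balancing in step 5 can be sketched with a minimal round-robin selector. The real proxy-inference policy (health checks, weighting, retries) is not specified here, so this is only illustrative:

```javascript
// Minimal round-robin selection over discovered inference agents.
// In proxy-inference this would run over peers found via the dht table;
// here agents are plain strings for illustration.
class RoundRobin {
  constructor (agents) {
    if (agents.length === 0) throw new Error('no agents discovered')
    this.agents = agents
    this.next = 0
  }

  pick () {
    const agent = this.agents[this.next % this.agents.length]
    this.next++
    return agent
  }
}

const lb = new RoundRobin(['agent-a', 'agent-b'])
// lb.pick() alternates between the two agents on successive calls
```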

Protocols

| Protocol | Description | Usage |
| --- | --- | --- |
| HTTPS | HTTP over TLS | app-ui/cli → app-node, cli/client → proxy-inference |
| HRPC | Hyperswarm RPC | Internal service communication (app-node → wrk-ork → workers), cli → proxy-inference |
| hyperswarm-ssh | SSH over Hyperswarm | cli → wrk-agent-instance |

Authentication Methods

| Method | Description | Used By |
| --- | --- | --- |
| ttr-token | Token-based authentication over HTTPS | app-ui, cosmicac-cli → app-node |
| api-key | API key authentication | cosmicac-cli, HTTP clients → proxy-inference |

Security with KubeVirt

CosmicAC uses KubeVirt to run user workloads in isolated virtual machines on Kubernetes. KubeVirt runs VMs in non-privileged pods, applies Kubernetes security controls (RBAC, SELinux, network policies), and exposes GPU devices through secure device plugins.

| Feature | Benefit |
| --- | --- |
| Pod-level isolation | Runs each VM in its own namespace with SELinux enforcement |
| Non-privileged pods | Runs VMs without elevated container privileges |
| GPU device plugins | Exposes GPU hardware without hostPath volumes and without compromising isolation |
