Lancium is seeking a Linux Systems Administrator to help maintain our High-Throughput Compute Grid at our Houston, Texas location. We are on track to grow our infrastructure to over 15,000 CPU cores and 1,000 GPUs by the end of the year. If you are experienced in large-scale infrastructure and want to be involved in the early stages of a new cloud offering focused on High-Throughput Computing, this might be the right place for you.
- Responsible for the installation, configuration and monitoring of the Houston High-Throughput Compute Servers.
- Troubleshoots server performance issues and reviews server logs for root cause analysis.
- Responsible for creating and maintaining documentation needed to support the business processes.
- Trains and assists other employees as needed.
- Familiarity with bash, Git, and Git-based workflows
- Familiarity with Linux systems administration
- Experience with hardware troubleshooting
- Flexibility to readily respond to changing circumstances and expectations; open to new ideas and procedures.
- High motivation to work with minimal supervision in a collaborative environment.
- Strong organization and time management skills, with the ability to prioritize and triage workflow.
- Must be able to lift 50 pounds of computer/network equipment when necessary
- Knowledge of management / deployment tools
- Familiarity with containerization technologies
- Experience with deployment and monitoring of large compute clusters
- Experience with Scientific, High-Throughput or High-Performance Computing environments
- Knowledge of Network configuration (VLAN, VXLAN, BGP, Domains, DNS, DHCP)
The Linux Systems Administrator will primarily work in Lancium’s Houston office. Work locations may include data centers with both climate-controlled and non-climate-controlled conditions. There may be exposure to extreme temperatures, noise and vibration, and mechanical or electrical hazards.
- Health Insurance
- Dental Insurance
- Vision Insurance
- Life Insurance
- Voluntary Short and Long Term Disability Insurance
- Paid Holidays and Time Off
Lancium is a technology company creating software and technical solutions that enable the faster growth of renewable energy. Our products include Lancium Smart Response™ for server power management, the Lancium Compute Platform for high-throughput computing applications, and Lancium Clean Compute Centers™ that absorb excess renewable energy. These solutions help ensure that renewable energy can power our future.
Lancium’s technical headquarters are located in Northwest Houston. Successful candidates will join a dynamic company with considerable room for advancement as the company expands into new markets and geographies.