2009 hardware purchase proposal

From Open Bioinformatics Foundation
Revision as of 16:36, 14 December 2009 by Dag (talk) (Justification for new Email Scanning Appliance)

Proposal to purchase new machines for OBF

Justification for total replacement of existing servers

Total Server Refresh - Move to 100% hypervisor-based virtualization

  • Existing servers date back to 2004 and are based on 32-bit Pentium 4 chipsets
  • Existing servers run CentOS 4.8; the current CentOS release is 5.4
  • No spare server capacity for new service and server requests from OBF community
  • Only two people (Chris D and Jason S) have full remote control, including the ability to reboot via remote power
    • Cannot easily or securely provide full remote control to other volunteers with the existing infrastructure
  • We need to move to 64-bit (x86_64) hardware and take advantage of CPU-level support for virtualization
  • Virtualizing our servers and services greatly reduces the operational burden and makes our IT infrastructure more "portable" should future situations demand it. It also solves many of our existing problems with granting high levels of remote admin access (including console and remote power control) to members of our sysadmin team
  • New 64-bit hardware with CPU level virtualization support would allow OBF to:
    • Run many more servers and services as needed
    • Provide higher levels of performance
    • Provide higher levels of redundancy, safety and portability
    • Distribute administrative powers much more widely
    • Consume less datacenter space
    • Consume less datacenter electricity

Justification for new Email Scanning Appliance

New Purchase - Mail scanning appliance

This proposal is to solve a problem that has long been somewhat invisible to the OBF community - the extremely large amount of work required to handle the massive volume of email traffic that our lists receive, particularly in dealing with and moderating emails that are clearly spam but have gotten through our own anti-spam and anti-virus filters. These messages end up in the moderator queues of all our mailing lists (dozens per day, per email list) and represent the single largest operational and administrative burden for OBF volunteers.

The fact is that in 2009 the free and open source methods for anti-spam and anti-virus cannot keep up with the current methods used by the bad guys. The most common source of spam these days is compromised PC systems that send email in small volumes, via seemingly legitimate accounts and at rotating intervals. These PC-based botnets are much harder to block via greylisting and blocklists than the previous generation of spammers, who preferred sending large volumes of email through a smaller collection of hosts.

Current methods of anti-spam and anti-virus include:

  • Greylisting in effect on all inbound email from unknown senders
  • All inbound/outbound email scanned by ClamAV for viral payloads
  • Email that passes the ClamAV test gets routed through MIMEDefang for additional scrutiny
  • Email that passes ClamAV and MIMEDefang gets processed by SpamAssassin (SA)
  • A high SA score causes the email to be discarded automatically
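
The order of the stages listed above can be sketched in a few lines of Python. This is only an illustrative model, not the actual OBF mail configuration: the `Message` fields stand in for the verdicts that ClamAV, MIMEDefang and SpamAssassin would return, the `DISCARD_SCORE` threshold is a hypothetical value (the proposal only says "a high SA score"), and greylisting is reduced to "defer the first attempt from an unknown sender".

```python
from dataclasses import dataclass

# Hypothetical threshold; the proposal does not state the real cutoff.
DISCARD_SCORE = 10.0


@dataclass
class Message:
    sender: str
    has_virus: bool    # stand-in for a ClamAV verdict
    bad_mime: bool     # stand-in for a MIMEDefang verdict
    spam_score: float  # stand-in for a SpamAssassin score


def filter_message(msg, known_senders):
    """Return the outcome of the first stage that stops the message."""
    # Stage 1: greylisting. The first attempt from an unknown sender is
    # deferred; the sender is remembered so that a retry gets through.
    if msg.sender not in known_senders:
        known_senders.add(msg.sender)
        return "deferred"
    # Stage 2: virus scan.
    if msg.has_virus:
        return "rejected: virus"
    # Stage 3: additional MIME scrutiny.
    if msg.bad_mime:
        return "rejected: mime"
    # Stage 4: spam scoring; only high scores are discarded automatically.
    if msg.spam_score >= DISCARD_SCORE:
        return "discarded"
    return "accepted"
```

In this model a message from a host that retries after the greylisting deferral and scores below the threshold is accepted, which is the behaviour the botnet traffic described earlier exploits.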

Even with the above methods in place, a huge amount of spam still gets through and clogs up the moderator queues of our very active mailing lists.

Bad effects of the spam deluge:

  • Overworked volunteer list administrators
  • Legit emails being lost or deleted in bulk moderator cleanup attempts
  • Once-open mailing lists have had to become more closed and harder to reach
  • Overall, our mailing list community is not as open or accessible as it has been in the past, due to the anti-spam defenses we have had to put in place

Proposed purchases

Proposals exist to purchase two main infrastructure devices:

Server Quotes

Deployment and upgrade plan

  • Decommission old machines, which helps free up rack space donated by BioTeam
  • Set up worldwide mirrors for backup purposes (rsync scripts from the repository?)
    • Our two most valuable components are source code and mailing lists (archives and membership). Losing these, or having them unavailable, is a HUGE problem. Can we ensure they are protected and redundantly preserved?
  • How can we balance security, an all-volunteer sysadmin team, and the moderate latency in responding to issues that an all-volunteer team implies?
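
As a starting point for the rsync-based mirroring raised in the list above, a minimal sketch of a helper that assembles the rsync invocation might look like the following. The function name, paths and mirror host are hypothetical placeholders, not existing OBF systems. The command deliberately omits `--delete`, so that a deletion on the primary is not propagated to the mirror, and it defaults to `--dry-run` so nothing is copied until the plan has been reviewed.

```python
def build_rsync_command(source, mirror, dry_run=True):
    """Build an rsync argument list for pushing `source` to `mirror`.

    Both arguments are illustrative placeholders, e.g. a local directory
    and a "host:/path" destination on a mirror site.
    """
    # --archive preserves permissions, times and symlinks;
    # --compress reduces traffic on wide-area links.
    cmd = ["rsync", "--archive", "--compress"]
    if dry_run:
        # Report what would be transferred without changing the mirror.
        cmd.append("--dry-run")
    cmd += [source, mirror]
    return cmd


if __name__ == "__main__":
    # Hypothetical source directory and mirror destination.
    print(" ".join(build_rsync_command("/srv/repo/",
                                       "mirror.example.org:/backup/repo/")))
```

The resulting list can be passed to `subprocess.run` from a scheduled job once the real source directories and mirror hosts are decided.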

Community requests

  • Latest and greatest source code tools: Git/Mercurial; Trac was also requested previously.
    • Has been problematic to support because we currently don't allow HTTP access to the dev machine
    • Can we set up NFS + httpd on a separate machine with a mirrored FS (read-only), NFS (read-write), or some other system?
  • How do we keep the wikis up to date with software - better wikifarm support?