====================================================================================================
General file server: Maintenance and network changes on 29.08.2022 between 9 a.m. and 11:30 p.m.
====================================================================================================
==================
Affected Services:
==================
In addition to home directories, all services that use the general file server are affected. These include: Ilias, web servers, workgroup servers, BSCW, NEMO, bwLehrpool, login servers, shares / group drives, and profiles (Windows).
=====================================
MAINTENANCE 1 - Change network access
=====================================
Until now, all connections to our file server were established to 16 different frontend nodes. For a few weeks we have already been redirecting new connections to our new frontend nodes in the background. At the end of the year these 16 old frontend nodes will be taken out of service due to their age. However, some connections to the previous nodes still exist. If you have established a connection to one of the following IPs, please disconnect it:
10.4.7.16
10.4.7.17
10.4.7.18
10.4.7.19
10.4.7.20
10.4.7.21
10.4.7.22
10.4.7.23
10.4.7.24
10.4.7.25
10.4.7.26
10.4.7.27
10.4.7.28
10.4.7.29
10.4.7.30
10.4.7.31
10.4.7.82
10.4.7.83
10.4.7.84
10.4.7.85
10.4.7.86
10.4.7.87
10.4.7.88
10.4.7.89
10.4.7.90
10.4.7.91
10.4.7.92
10.4.7.93
10.4.7.94
10.4.7.95
For SMB, reconnection should normally happen automatically as part of the subsequent maintenance. For NFS connections, however, it does not, because the NFS client simply waits until the IP becomes available again. Please note that we will shut these IPs down, after which such connections will no longer work.
Customers who have established their connection using ufr-dyn.isi1.public.ads.uni-freiburg.de are not affected, as the corresponding IPs have already been moved to the new storage nodes.
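One way to check whether a client still has NFS mounts pointing at one of the old frontend IPs is to filter the mount table for the address ranges listed above (10.4.7.16-31 and 10.4.7.82-95). This is only a sketch; the mount paths on your system will differ:

```shell
# Regular expression covering the old frontend IPs listed above:
# 10.4.7.16-31 and 10.4.7.82-95 (the trailing colon matches "IP:/export").
OLD_IPS='10\.4\.7\.(1[6-9]|2[0-9]|3[01]|8[2-9]|9[0-5]):'

# List current mounts and flag any that still point at an old frontend IP.
# If this prints a mount line, disconnect it and remount via the DNS name.
mount | grep -E "$OLD_IPS" || echo "no mounts on old frontend IPs"
```

If a mount is found, unmount it and remount via ufr.isi1.public.ads.uni-freiburg.de (or the hostname appropriate for your setup) so the client resolves to one of the new frontend nodes.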
===============================
MAINTENANCE 2 - Firmware update
===============================
The firmware of some of our storage nodes (16 old and 4 new) needs to be updated. This process will take about 6 hours. If your connection is to one of the new nodes, the impairment will last about 2 hours. We can only update one new and one old storage node at a time, so updating the 16 old nodes will take correspondingly longer.
Impairments new nodes: approx. 9 - 11 a.m.
Impairments old nodes: approx. 9 a.m. - 4 p.m.
=======================================
MAINTENANCE 3 - Operating system update
=======================================
After the firmware update, the operating system of the storage nodes (16 old, 4 new, and 16 backend nodes) must be updated. This process will take approximately 7.5 hours. Again, we can only update one storage node from each storage pool at a time. Impairments for the new storage nodes are again expected to last about 2 hours; impairments of the old storage nodes must be expected during the remaining time.
Impairment of new nodes: approx. 4 - 6 p.m.
Impairment of old nodes: approx. 4 - 11:30 p.m.
If the firmware update finishes earlier, we will start the operating system update correspondingly earlier.
==================
General Impact:
==================
Depending on how the various services are connected, outages of varying lengths will occur. Login, session, and storage problems can therefore occur at any time within the maintenance window. If necessary, please refer to the protocol-specific notes below.
Each storage node is updated and restarted individually, one after the other, and is therefore unavailable for the duration of its update (approx. 20-30 minutes). Services whose protocol connection automatically fails over to another storage node will only be affected briefly. Services whose protocol connection does not allow automatic failover will be unavailable for up to approx. 40 minutes. Since the individual storage nodes are updated at arbitrary points in time, we cannot predict when a particular service will be affected.
Note for home directories / shares / group drives: for these services, the directory may be unavailable for a short time and access may hang. Depending on the timeout, access may be possible again after just a few minutes, so you may simply have to wait a short while. If access is still not possible after a longer period, you may have to establish a new connection manually.
In general, since each storage node is updated individually, services may be affected several times within the maintenance window.
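The "wait a few minutes, then reconnect manually" advice above can be scripted. The following sketch probes a path with a timeout so that a hung NFS mount does not block the shell; the path, number of attempts, and interval are placeholders, not fixed values:

```shell
# Probe a path repeatedly until it responds again; return non-zero if it
# stays unreachable, in which case the mount should be re-established
# manually. All parameters below are examples.
wait_for_path() {
  path=$1
  attempts=${2:-10}   # how many probes before giving up
  interval=${3:-30}   # seconds between probes
  i=1
  while [ "$i" -le "$attempts" ]; do
    # "timeout 5" aborts the probe if the mount is hanging hard,
    # since even a plain "ls" can block on a hung NFS hard mount.
    if timeout 5 ls "$path" >/dev/null 2>&1; then
      return 0        # directory is reachable again
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  return 1            # still hanging after all attempts: remount manually
}

# Example: probe /tmp up to 3 times, 5 seconds apart.
wait_for_path /tmp 3 5 && echo "reachable"
```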
Notes for the different protocols:
=======================================
Impact for NFSv3 customers using
ufr-dyn.isi1.public.ads.uni-freiburg.de
=======================================
Customers who mount our storage system via NFSv3 using the hostname ufr-dyn.isi1.public.ads.uni-freiburg.de should be only minimally impacted by this procedure. The reason is that the IP of a storage node is automatically moved to another node as soon as the original node becomes unavailable. We therefore expect only a brief increase in latency.
==============================================
Impact for all other clients (SMB + NFSv3/v4),
which use ufr.isi1.public.ads.uni-freiburg.de
(also applies to ufr2, fnet and phfr)
==============================================
For all customers who mount the storage area via ufr.isi1.public.ads.uni-freiburg.de (both SMB and NFSv3/v4), this procedure mainly means that at some point the node they are connected to will be unavailable for the duration of its reboot / update (about 30 minutes). If necessary, a new connection to the storage system can be established manually or automatically right away in order to reach a different node. This keeps downtime to a minimum, although it can of course happen that the new connection lands on a node that is updated later.
NFS/SMB: with a hard mount, the connection will naturally hang until the storage node is available again. Please also note the network changes from maintenance 1: if you are still connected to one of the old IPs, establish a new connection manually.
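The manual reconnection mentioned above boils down to releasing the hung mount and remounting via the DNS name, so the client resolves to one of the new frontend nodes. The sketch below only prints the commands (a dry run); the mount point and export path are placeholders for your actual values:

```shell
# Print the recovery commands for a hung NFS hard mount (dry run: remove
# the "echo" prefixes to actually execute them, typically as root).
remount_via_dns() {
  mnt=$1          # local mount point, e.g. /mnt/share (placeholder)
  export_path=$2  # server-side export path (placeholder)
  # A lazy unmount detaches the hung mount point immediately:
  echo umount -l "$mnt"
  # Remount via the DNS name instead of a fixed (old) IP:
  echo mount -t nfs "ufr.isi1.public.ads.uni-freiburg.de:$export_path" "$mnt"
}

remount_via_dns /mnt/share /export/share
```

Using the DNS name rather than a hard-coded IP is what allows the connection to follow the migration to the new frontend nodes.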
We apologize for any inconvenience and will do our best to minimize the impact.
Yours sincerely,
Your Storage Team