CLVM enhancements and fixes#12617
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests.
Additional details and impacted files@@ Coverage Diff @@
## main #12617 +/- ##
=============================================
- Coverage 17.90% 3.68% -14.23%
=============================================
Files 5938 454 -5484
Lines 532864 38798 -494066
Branches 65192 7151 -58041
=============================================
- Hits 95392 1428 -93964
+ Misses 426793 37183 -389610
+ Partials 10679 187 -10492
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
5fc1f12 to
9e03f4b
Compare
|
@blueorangutan package |
|
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16801 |
a08e7a5 to
df61d6f
Compare
df61d6f to
43e9384
Compare
Codecov Report✅ All modified and coverable lines are covered by tests.
Additional details and impacted files@@ Coverage Diff @@
## main #12617 +/- ##
=============================================
- Coverage 18.75% 3.53% -15.23%
=============================================
Files 6160 464 -5696
Lines 552578 40203 -512375
Branches 67348 7563 -59785
=============================================
- Hits 103660 1421 -102239
+ Misses 437512 38592 -398920
+ Partials 11406 190 -11216
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
3f900e8 to
c9dd7ed
Compare
|
@blueorangutan package |
|
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16875 |
|
@blueorangutan package |
|
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16877 |
|
@blueorangutan package |
|
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18195 |
|
@blueorangutan test |
|
@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
@blueorangutan package |
|
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18198 |
|
@blueorangutan test |
|
@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
JoaoJandre
left a comment
There was a problem hiding this comment.
I was only able to review half the files. I will do the rest later.
Also, could you point to a documentation somewhere on how to setup a CLVM_NG storage to add it to ACS? I was not able to find it online
| } | ||
| } | ||
|
|
||
| updateClvmLockHostForVmVolumes(vm.getId(), dstHostId); |
There was a problem hiding this comment.
If the result is false, or null, we still update the lock? How do we know if this update will be correct?
If the host does not execute the command (the command queue is full and the command times out, or the host is restarted) we will update the lock metadata to point to the wrong host, correct?
There was a problem hiding this comment.
if it does, in the next re-try it will get corrected.
| return new MigrateCommand.MigrateDiskInfo(srcVolumeInfo.getPath(), | ||
| MigrateCommand.MigrateDiskInfo.DiskType.FILE, | ||
| MigrateCommand.MigrateDiskInfo.DriverType.QCOW2, | ||
| MigrateCommand.MigrateDiskInfo.Source.FILE, | ||
| destPath, | ||
| backingPath); |
There was a problem hiding this comment.
Is this a style change only? Is it needed?
There was a problem hiding this comment.
this was just cosmetic for better readability.
| String details = preMigAnswer != null ? preMigAnswer.getDetails() : "null answer returned"; | ||
| logger.warn("PreMigrationCommand failed for CLVM/CLVM_NG VM [{}] on source host [{}]: {}. Migration will continue but may fail if volumes are exclusively locked.", vmTO.getName(), srcHost.getId(), details); |
There was a problem hiding this comment.
Shouldn't we abort the migration in this case?
If we continue and it does not work, it could lead to problems in the VM, no?
There was a problem hiding this comment.
No - if it fails, it will behave like any VM migration failure. The source VM / volume goes back to original state.
It can be found at: apache/cloudstack-documentation#637 |
|
@blueorangutan package |
|
@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
| // Check if CLVM lock transfer is needed (even if moveVolumeNeeded is false) | ||
| // This handles the case where the volume is already on the correct storage pool | ||
| // but the VM is running on a different host, requiring only a lock transfer | ||
| boolean isClvmLockTransferNeeded = !moveVolumeNeeded && | ||
| isClvmLockTransferRequired(newVolumeOnPrimaryStorage, existingVolumeOfVm, vm); | ||
|
|
||
| if (isClvmLockTransferNeeded) { | ||
| // CLVM lock transfer - no data copy, no pool change needed | ||
| newVolumeOnPrimaryStorage = executeLightweightLockMigration( | ||
| newVolumeOnPrimaryStorage, vm, existingVolumeOfVm, | ||
| "CLVM lock transfer", "same pool to different host"); | ||
| } else if (moveVolumeNeeded) { | ||
| PrimaryDataStoreInfo primaryStore = (PrimaryDataStoreInfo)newVolumeOnPrimaryStorage.getDataStore(); | ||
| if (primaryStore.isLocal()) { | ||
| throw new CloudRuntimeException( | ||
| "Failed to attach local data volume " + volumeToAttach.getName() + " to VM " + vm.getDisplayName() + " as migration of local data volume is not allowed"); | ||
| } | ||
| StoragePoolVO vmRootVolumePool = _storagePoolDao.findById(existingVolumeOfVm.getPoolId()); | ||
|
|
||
| try { | ||
| HypervisorType volumeToAttachHyperType = _volsDao.getHypervisorType(volumeToAttach.getId()); | ||
| newVolumeOnPrimaryStorage = _volumeMgr.moveVolume(newVolumeOnPrimaryStorage, vmRootVolumePool.getDataCenterId(), vmRootVolumePool.getPodId(), vmRootVolumePool.getClusterId(), | ||
| volumeToAttachHyperType); | ||
| } catch (ConcurrentOperationException | StorageUnavailableException e) { | ||
| logger.debug("move volume failed", e); | ||
| throw new CloudRuntimeException("move volume failed", e); | ||
| boolean isClvmLightweightMigration = isClvmLightweightMigrationNeeded( | ||
| newVolumeOnPrimaryStorage, existingVolumeOfVm, vm); | ||
|
|
||
| if (isClvmLightweightMigration) { | ||
| newVolumeOnPrimaryStorage = executeLightweightLockMigration( | ||
| newVolumeOnPrimaryStorage, vm, existingVolumeOfVm, | ||
| "CLVM lightweight migration", "different pools, same VG"); | ||
| } else { | ||
| StoragePoolVO vmRootVolumePool = _storagePoolDao.findById(existingVolumeOfVm.getPoolId()); | ||
|
|
||
| try { | ||
| HypervisorType volumeToAttachHyperType = _volsDao.getHypervisorType(volumeToAttach.getId()); | ||
| newVolumeOnPrimaryStorage = _volumeMgr.moveVolume(newVolumeOnPrimaryStorage, vmRootVolumePool.getDataCenterId(), vmRootVolumePool.getPodId(), vmRootVolumePool.getClusterId(), | ||
| volumeToAttachHyperType); | ||
| } catch (ConcurrentOperationException | StorageUnavailableException e) { | ||
| logger.debug("move volume failed", e); | ||
| throw new CloudRuntimeException("move volume failed", e); | ||
| } | ||
| } |
There was a problem hiding this comment.
Can this volume move section be extracted to a separate method to reduce indentation?
There was a problem hiding this comment.
Would it be ok if I addressed this in another PR - with other enhancements that are yet to come.
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 18208 |


Description
This PR enhances the existing CLVM implementation which was based on the deprecated CLVM technology which was based on corosync/pacemaker. With RHEL 7 having reached EOL, CLVM seems to be broken. CLVM supports RAW volumes on LVM , where as CLVM_NG support QCOW2 on LVM.
Further details: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Modernized+CLVM%3A+Enhancements+and+CLVM_NG+support
NOTE: On testing - it was identified that incremental snapshots for clvm-ng do not work as expected. As of now it's been removed from scope. So, CLVM and CLVM_NG would only support full snapshots.
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?