diff --git a/.gitignore b/.gitignore index a08cff9b..254c0cc5 100644 --- a/.gitignore +++ b/.gitignore @@ -10,3 +10,6 @@ helix-core-api/ # Depot conversion results clones/ *.log + +# Testing failures +core diff --git a/README.md b/README.md index 40be9100..4b28f6f4 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ This tool solves some of the most impactful scaling and performance limitations ## Performance -Please be aware that this tool is fast enough to instantaneously generate a tremendous amount of load on your Perforce server (more than 150K requests in a few seconds if running with a couple hundred network threads). Since p4-fusion will continue generating load within the limits set using the runtime arguments, it needs careful monitoring to ensure that your Perforce server does not get impacted. +Please be aware that this tool is fast enough to instantaneously generate a tremendous amount of load on your Perforce server (more than 150K requests in a few seconds if running with a couple hundred network threads). Since p4-fusion will continue generating load within the limits set using the runtime arguments, it needs careful monitoring to ensure that your Perforce server does not get impacted. However, having no rate limits and running this tool with several hundred network threads (or more if possible) is the ideal case for achieving maximum speed in the conversion process. @@ -44,6 +44,12 @@ These execution times are expected to scale as expected with larger depots (mill --lookAhead [Required] How many CLs in the future, at most, shall we keep downloaded by the time it is to commit them? +--branch [Optional] + A branch to migrate under the depot path. May be specified more than once. If at least one is given and the noMerge option is false, then the Git repository will include merges between branches in the history. 
You may use the formatting 'depot/path:git-alias', separating the Perforce branch sub-path from the git alias name by a ':'; if the depot path contains a ':', then you must provide the git branch alias. + +--noMerge [Optional, Default is false] + When false and at least one branch is given, then the Git repository will include merges between branches in the history. If this is true, then the Git history will not contain any merges; the branches share only an artificial empty commit added at the root, which acts as a common source to make later merges easier. + --maxChanges [Optional, Default is -1] Specify the max number of changelists which should be processed in a single run. -1 signifies unlimited range. @@ -75,6 +81,28 @@ These execution times are expected to scale as expected with larger depots (mill Specify which P4USER to use. Please ensure that the user is logged in. ``` +## Notes On Branches + +When at least one branch argument exists, the tool will enable branching mode. + +Branching mode currently only supports very simple branch layouts. The format must be `//common/depot/path/branch-name`. The common depot path is given as the `--path` argument, and each `--branch` argument specifies one branch name to inspect. Branch names must be a directory name immediately after the path (it replaces the `...`). + +In branching mode, the generated Git repository will be initially populated with a zero-content commit. This allows branches to later be merged without needing the `--allow-unrelated-histories` flag in Git. All branches will have this in their history. + +If a Perforce changelist contains an integration-like action (move, integrate, copy, etc.) from another branch listed in a `--branch` argument, then the tool will mark the Git commit with the integration as having two parents - the current branch and the source branch. If a changelist contains integrations into one branch from multiple other branches, they are put into separate commits, each with just one source branch. 
If a changelist contains integrations into multiple branches, then each one of those is also its own commit. + +Because Perforce integration isn't a 1-to-1 mapping onto Git merge, there can be situations where having the tool mark a commit as a merge, but not bringing over all the changes, leads to later merge logic not picking up every changed file correctly. To avoid this situation, the `--noMerge true` option ensures that the branches share only the single zero-content root commit, so any merge done after the migration will force a full file tree inspection. + +If the Perforce tree contains sub-branches, such as `//base/tree/sub` being a sub-branch of `//base/tree`, then you can use the arguments `--path //base/... --branch tree/sub:tree-sub --branch tree`. The ordering is important here - provide the deeper paths first to have them take priority over the others. Because Git creates branches with '/' characters as implicit directories, you must provide the Git branch alias to prevent Git reporting an error where the branch "tree" can't be created because it is already a directory, or "tree/sub" can't be created because "tree" isn't a directory. + +## Checking Results + +In order to test the validity of the logic, we need to run the program over a Perforce depot and compare each changelist against the corresponding Git commit SHA, to ensure the files match up. + +The provided script [validate-migration.sh](validate-migration.sh) runs through every generated Git commit, and ensures the file state exactly matches the state of the Perforce depot. + +Because of the extra effort the script performs, expect it to take orders of magnitude longer than the original p4-fusion execution. + ## Build 0. Pre-requisites diff --git a/p4-fusion/branch_set.cc b/p4-fusion/branch_set.cc new file mode 100644 index 00000000..3fb06c3a --- /dev/null +++ b/p4-fusion/branch_set.cc @@ -0,0 +1,281 @@ +/* + * Copyright (c) 2022 Salesforce, Inc. + * All rights reserved. 
+ * SPDX-License-Identifier: BSD-3-Clause + * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause + */ +#include "branch_set.h" +#include + +static const std::string EMPTY_STRING = ""; +static const std::array INVALID_BRANCH_PATH { EMPTY_STRING, EMPTY_STRING }; + +std::vector BranchedFileGroup::GetRelativeFileNames() +{ + std::vector ret; + for (auto& fileData : files) + { + ret.push_back(fileData.GetRelativePath()); + } + return ret; +} + +ChangedFileGroups::ChangedFileGroups() + : totalFileCount(0) +{ +} + +ChangedFileGroups::ChangedFileGroups(std::vector& groups, int totalFileCount) + : totalFileCount(totalFileCount) +{ + branchedFileGroups = std::move(groups); +} + +void ChangedFileGroups::Clear() +{ + for (auto& fileGroup : branchedFileGroups) + { + for (auto& file : fileGroup.files) + { + file.Clear(); + } + fileGroup.files.clear(); + fileGroup.sourceBranch.clear(); + fileGroup.targetBranch.clear(); + } + totalFileCount = 0; +} + +Branch::Branch(const std::string& branch, const std::string& alias) + : depotBranchPath(branch) + , gitAlias(alias) +{ + if (depotBranchPath.empty()) + { + throw std::invalid_argument("branch name is empty"); + } + if (gitAlias.empty()) + { + throw std::invalid_argument("branch alias is empty"); + } +} + +std::array Branch::SplitBranchPath(const std::string& relativeDepotPath) const +{ + if ( + // The relative depot branch, to match this branch path, must start with the + // branch path + "/". The "StartsWith" is put at the end of the 'and' checks, + // because it takes the longest. 
+ relativeDepotPath.size() > depotBranchPath.size() + && relativeDepotPath[depotBranchPath.size()] == '/' + && STDHelpers::StartsWith(relativeDepotPath, depotBranchPath)) + { + return { gitAlias, relativeDepotPath.substr(depotBranchPath.size() + 1) }; + } + return { "", "" }; +} + +Branch createBranchFromPath(const std::string& depotBranchPath) +{ + std::string branchPath = std::string(depotBranchPath); + std::string alias = std::string(depotBranchPath); + + // The formatting using a ':' to separate the branch path from the git alias MUST be + // the last ':' in the string. This allows the command to work with branch paths that contain + // a ':' character, as long as the git alias does NOT contain a ':', and it implies that the git + // alias MUST be given. + size_t pos = depotBranchPath.rfind(':'); + if (pos > 0 && depotBranchPath.size() > pos) + { + branchPath.erase(pos); + alias.erase(0, pos + 1); + } + + STDHelpers::StripSurrounding(branchPath, '/'); + STDHelpers::StripSurrounding(alias, '/'); + return Branch(branchPath, alias); +} + +std::vector createBranchesFromPaths(const std::vector& branches) +{ + std::vector parsed; + for (auto& branch : branches) + { + parsed.push_back(createBranchFromPath(branch)); + } + return parsed; +} + +BranchSet::BranchSet(std::vector& clientViewMapping, const std::string& baseDepotPath, const std::vector& branches, const bool includeBinaries) + : m_branches(createBranchesFromPaths(branches)) + , m_includeBinaries(includeBinaries) +{ + m_view.InsertTranslationMapping(clientViewMapping); + if (STDHelpers::EndsWith(baseDepotPath, "/...")) + { + // Keep the final '/'. 
+ m_basePath = baseDepotPath.substr(0, baseDepotPath.size() - 3); + } + else if (baseDepotPath.back() != '/') + { + throw std::invalid_argument("Bad base depot path format: " + baseDepotPath); + } + else + { + m_basePath = baseDepotPath; + } +} + +std::array BranchSet::splitBranchPath(const std::string& relativeDepotPath) const +{ + // Check if the relative depot path starts with any of the branches. + // This checks the branches in their stored order, which can mean that having a branch + // order like "//a/b/c" and "//a/b" will only work if the sub-branches are listed first. + // To do this properly, the stored branches should be scanned based on their length - longest + // first, but that's extra processing and code for a use case that is rare and has a manual + // work around (list branches in a specific order). + for (auto& branch : m_branches) + { + auto split = branch.SplitBranchPath(relativeDepotPath); + if (!split[0].empty() && !split[1].empty()) + { + return split; + } + } + return { "", "" }; +} + +std::string BranchSet::stripBasePath(const std::string& depotPath) const +{ + if (STDHelpers::StartsWith(depotPath, m_basePath)) + { + // strip off the leading '/', too. + return depotPath.substr(m_basePath.size()); + } + return EMPTY_STRING; +} + +struct branchIntegrationMap +{ + std::vector branchGroups; + std::unordered_map branchIndicies; + int fileCount = 0; + + void addMerge(const std::string& sourceBranch, const std::string& targetBranch, const FileData& rev); + void addTarget(const std::string& targetBranch, const FileData& rev); + + // note: not const, because it cleans out the branchGroups. 
+ std::unique_ptr createChangedFileGroups() { return std::unique_ptr(new ChangedFileGroups(branchGroups, fileCount)); }; +}; + +void branchIntegrationMap::addTarget(const std::string& targetBranch, const FileData& fileData) +{ + addMerge(EMPTY_STRING, targetBranch, fileData); +} + +void branchIntegrationMap::addMerge(const std::string& sourceBranch, const std::string& targetBranch, const FileData& fileData) +{ + // Need to store this in the integration map, using "src/tgt" as the + // key. Because stream names can't have a '/' in them, this creates a unique key. + // source might be empty, and that's okay. + const std::string mapKey = sourceBranch + "/" + targetBranch; + const auto entry = branchIndicies.find(mapKey); + if (entry == branchIndicies.end()) + { + const int index = branchGroups.size(); + branchIndicies.insert(std::make_pair(mapKey, index)); + branchGroups.push_back(BranchedFileGroup()); + BranchedFileGroup& bfg = branchGroups[index]; + bfg.sourceBranch = sourceBranch; + bfg.targetBranch = targetBranch; + bfg.hasSource = !sourceBranch.empty(); + bfg.files.push_back(fileData); + } + else + { + branchGroups.at(entry->second).files.push_back(fileData); + } + fileCount++; +} + +// Post condition: all returned FileData (e.g. filtered for git commit) have the relativePath set. +std::unique_ptr BranchSet::ParseAffectedFiles(const std::vector& cl) const +{ + branchIntegrationMap branchMap; + for (auto& clFileData : cl) + { + FileData fileData(clFileData); + + // First, filter out files we don't want. + const std::string& depotFile = fileData.GetDepotFile(); + if ( + // depot file should always be present. + // The left side of the client view is the depot side. 
+ !m_view.IsInLeft(depotFile) + || (!m_includeBinaries && fileData.IsBinary()) + || STDHelpers::Contains(depotFile, "/.git/") // To avoid adding .git/ files in the Perforce history if any + || STDHelpers::EndsWith(depotFile, "/.git") // To avoid adding a .git submodule file in the Perforce history if any + ) + { + continue; + } + std::string relativeDepotPath = stripBasePath(depotFile); + if (relativeDepotPath.empty()) + { + // Not under the depot path. Shouldn't happen due to the way we + // scan for files, but... + continue; + } + + // If we have branches, then possibly sort the file into a branch group. + if (HasMergeableBranch()) + { + // [0] == branch name, [1] == relative path in the branch. + std::array branchPath = splitBranchPath(relativeDepotPath); + if ( + branchPath[0].empty() + || branchPath[1].empty()) + { + // not a valid branch file. skip it. + continue; + } + + // It's a valid destination to a branch. + // Make sure the relative path is set. + fileData.SetRelativePath(branchPath[1]); + + bool needsHandling = true; + if (fileData.IsIntegrated()) + { + // Only add the integration if the source is from a branch we care about. + // [0] == branch name, [1] == relative path in the branch. + std::array fromBranchPath = splitBranchPath(stripBasePath(fileData.GetFromDepotFile())); + if ( + !fromBranchPath[0].empty() + && !fromBranchPath[1].empty() + + // Can't have source and target be pointing to the same branch; that's not + // a branch operation in the Git sense. + && fromBranchPath[0] != branchPath[0]) + { + // This is a valid integrate from a known source to a known target branch. + branchMap.addMerge(fromBranchPath[0], branchPath[0], fileData); + needsHandling = false; + } + } + if (needsHandling) + { + // Either not a valid integrate, or a normal operation. + branchMap.addTarget(branchPath[0], fileData); + } + } + else + { + // It's a non-branching setup. + // Make sure the relative path is set. 
+ fileData.SetRelativePath(relativeDepotPath); + branchMap.addTarget(EMPTY_STRING, fileData); + } + } + return branchMap.createChangedFileGroups(); +} diff --git a/p4-fusion/branch_set.h b/p4-fusion/branch_set.h new file mode 100644 index 00000000..5f0e607a --- /dev/null +++ b/p4-fusion/branch_set.h @@ -0,0 +1,98 @@ +/* + * Copyright (c) 2022 Salesforce, Inc. + * All rights reserved. + * SPDX-License-Identifier: BSD-3-Clause + * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause + */ +#pragma once + +#include +#include +#include +#include +#include + +#include "commands/file_map.h" +#include "commands/file_data.h" +#include "utils/std_helpers.h" + +struct BranchedFileGroup +{ + // If a BranchedFiles collection hasSource == true, + // then all files in this collection MUST be a merge + // from the given source branch to the target branch. + // These branch names will be the Git branch names. + std::string sourceBranch; + std::string targetBranch; + bool hasSource; + std::vector files; + + // Get all the relative file names from each of the file data. + std::vector GetRelativeFileNames(); +}; + +struct ChangedFileGroups +{ +private: + ChangedFileGroups(); + +public: + std::vector branchedFileGroups; + int totalFileCount; + + // When all the file groups have finished being used, + // only then can we safely clear out the data. + void Clear(); + + ChangedFileGroups(std::vector& groups, int totalFileCount); + + static std::unique_ptr Empty() { return std::unique_ptr(new ChangedFileGroups); }; +}; + +struct Branch +{ +public: + const std::string depotBranchPath; + const std::string gitAlias; + + Branch(const std::string& branch, const std::string& alias); + + // splitBranchPath If the relativeDepotPath matches, returns {branch alias, branch file path}. 
+ // Otherwise, returns {"", ""} + std::array SplitBranchPath(const std::string& relativeDepotPath) const; +}; + +// A singular view on the branches and a base view (acts as a filter to trim down affected files). +// Maps a changed file state to a list of resulting branches and affected files. +struct BranchSet +{ +private: + // Technically, these should all be const. + const bool m_includeBinaries; + std::string m_basePath; + const std::vector m_branches; + FileMap m_view; + + // stripBasePath removes the base path from the depot path, or returns "" if not in the base path. + std::string stripBasePath(const std::string& depotPath) const; + + // splitBranchPath extracts the branch name and path under the branch (no leading '/' on the path) + // relativeDepotPath - already stripped from running stripBasePath. + std::array splitBranchPath(const std::string& relativeDepotPath) const; + +public: + BranchSet(std::vector& clientViewMapping, const std::string& baseDepotPath, const std::vector& branches, const bool includeBinaries); + + // HasMergeableBranch is there a branch model that requires integration history? + bool HasMergeableBranch() const { return !m_branches.empty(); }; + + int Count() const { return m_branches.size(); }; + + // ParseAffectedFiles creates collections of merges and commits. + // Breaks up the files into those that are within the view, with each item in the + // list being its own target Git branch. + // This also has the side-effect of populating the relative path value in the file data. + // ... the FileData object is copied, but its underlying data is shared. So, this + // breaks the const. 
+ std::unique_ptr ParseAffectedFiles(const std::vector& cl) const; +}; diff --git a/p4-fusion/commands/change_list.cc b/p4-fusion/commands/change_list.cc index 4a5abf3d..06646507 100644 --- a/p4-fusion/commands/change_list.cc +++ b/p4-fusion/commands/change_list.cc @@ -8,6 +8,7 @@ #include "p4_api.h" #include "describe_result.h" +#include "filelog_result.h" #include "print_result.h" #include "utils/std_helpers.h" @@ -18,17 +19,35 @@ ChangeList::ChangeList(const std::string& clNumber, const std::string& clDescrip , user(userID) , description(clDescription) , timestamp(clTimestamp) + , changedFileGroups(ChangedFileGroups::Empty()) { } -void ChangeList::PrepareDownload() +void ChangeList::PrepareDownload(const BranchSet& branchSet) { ChangeList& cl = *this; - ThreadPool::GetSingleton()->AddJob([&cl](P4API* p4) + ThreadPool::GetSingleton()->AddJob([&cl, &branchSet](P4API* p4) { - const DescribeResult& describe = p4->Describe(cl.number); - cl.changedFiles = std::move(describe.GetFileData()); + std::vector changedFiles; + if (branchSet.HasMergeableBranch()) + { + // If we care about branches, we need to run filelog to get where the file came from. + // Note that the filelog won't include the source changelist, but + // that doesn't give us too much information; even a full branch + // copy will have the target files listing the from-file with + // different changelists than the point-in-time source branch's + // changelist. + const FileLogResult& filelog = p4->FileLog(cl.number); + cl.changedFileGroups = branchSet.ParseAffectedFiles(filelog.GetFileData()); + } + else + { + // If we don't care about branches, then p4->Describe is much faster. 
+ const DescribeResult& describe = p4->Describe(cl.number); + cl.changedFileGroups = branchSet.ParseAffectedFiles(describe.GetFileData()); + } + { std::unique_lock lock((*(cl.canDownloadMutex))); *cl.canDownload = true; @@ -37,11 +56,11 @@ void ChangeList::PrepareDownload() }); } -void ChangeList::StartDownload(const std::string& depotPath, const int& printBatch, const bool includeBinaries) +void ChangeList::StartDownload(const int& printBatch) { ChangeList& cl = *this; - ThreadPool::GetSingleton()->AddJob([&cl, &depotPath, printBatch, includeBinaries](P4API* p4) + ThreadPool::GetSingleton()->AddJob([&cl, printBatch](P4API* p4) { // Wait for describe to finish, if it is still running { @@ -52,50 +71,40 @@ void ChangeList::StartDownload(const std::string& depotPath, const int& printBat *cl.filesDownloaded = 0; - if (cl.changedFiles.empty()) - { - return; - } - std::shared_ptr> printBatchFiles = std::make_shared>(); std::shared_ptr> printBatchFileData = std::make_shared>(); - - for (int i = 0; i < cl.changedFiles.size(); i++) + // Only perform the group inspection if there are files. 
+ if (cl.changedFileGroups->totalFileCount > 0) { - FileData& fileData = cl.changedFiles[i]; - if (p4->IsFileUnderDepotPath(fileData.depotFile, depotPath) - && p4->IsFileUnderClientSpec(fileData.depotFile) - && (includeBinaries || !p4->IsBinary(fileData.type)) - && !STDHelpers::Contains(fileData.depotFile, "/.git/") // To avoid adding .git/ files in the Perforce history if any - && !STDHelpers::EndsWith(fileData.depotFile, "/.git")) // To avoid adding a .git submodule file in the Perforce history if any + for (auto& branchedFileGroup : cl.changedFileGroups->branchedFileGroups) { - fileData.shouldCommit = true; - printBatchFiles->push_back(fileData.depotFile + "#" + fileData.revision); - printBatchFileData->push_back(&fileData); - } - else - { - (*cl.filesDownloaded)++; - cl.commitCV->notify_all(); - } - - // Clear the batches if it fits - if (printBatchFiles->size() == printBatch) - { - cl.Flush(printBatchFiles, printBatchFileData); - - // We let go of the refs held by us and create new ones to queue the next batch - printBatchFiles = std::make_shared>(); - printBatchFileData = std::make_shared>(); - // Now only the thread job has access to the older batch + // Note: the files at this point have already been filtered. 
+ for (auto& fileData : branchedFileGroup.files) + { + if (fileData.IsDownloadNeeded()) + { + fileData.SetPendingDownload(); + printBatchFiles->push_back(fileData.GetDepotFile() + "#" + fileData.GetRevision()); + printBatchFileData->push_back(&fileData); + + // Clear the batches if it fits + if (printBatchFiles->size() == printBatch) + { + cl.Flush(printBatchFiles, printBatchFileData); + + // We let go of the refs held by us and create new ones to queue the next batch + printBatchFiles = std::make_shared>(); + printBatchFileData = std::make_shared>(); + // Now only the thread job has access to the older batch + } + } + } } } - // Flush any remaining files that were smaller in number than the total batch size - if (!printBatchFiles->empty()) - { - cl.Flush(printBatchFiles, printBatchFileData); - } + // Flush any remaining files that were smaller in number than the total batch size. + // Additionally, signal the batch processing end. + cl.Flush(printBatchFiles, printBatchFileData); }); } @@ -104,15 +113,20 @@ void ChangeList::Flush(std::shared_ptr> printBatchFiles // Share ownership of this batch with the thread job ThreadPool::GetSingleton()->AddJob([this, printBatchFiles, printBatchFileData](P4API* p4) { - const PrintResult& printData = p4->PrintFiles(*printBatchFiles); - - for (int i = 0; i < printBatchFiles->size(); i++) + // Only perform the batch processing when there are files to process. + if (!printBatchFileData->empty()) { - printBatchFileData->at(i)->contents = std::move(printData.GetPrintData().at(i).contents); - } + const PrintResult& printData = p4->PrintFiles(*printBatchFiles); - (*filesDownloaded) += printBatchFiles->size(); + for (int i = 0; i < printBatchFiles->size(); i++) + { + printBatchFileData->at(i)->MoveContentsOnceFrom(printData.GetPrintData().at(i).contents); + } + + (*filesDownloaded) += printBatchFiles->size(); + } + // Ensure the notify_all is called. 
commitCV->notify_all(); }); } @@ -121,7 +135,7 @@ void ChangeList::WaitForDownload() { std::unique_lock lock(*commitMutex); commitCV->wait(lock, [this]() - { return *(filesDownloaded) == (int)changedFiles.size(); }); + { return *(filesDownloaded) == (int)changedFileGroups->totalFileCount; }); } void ChangeList::Clear() @@ -129,7 +143,7 @@ void ChangeList::Clear() number.clear(); user.clear(); description.clear(); - changedFiles.clear(); + changedFileGroups->Clear(); filesDownloaded.reset(); canDownload.reset(); diff --git a/p4-fusion/commands/change_list.h b/p4-fusion/commands/change_list.h index 8e1d9851..d1bda7a7 100644 --- a/p4-fusion/commands/change_list.h +++ b/p4-fusion/commands/change_list.h @@ -12,7 +12,7 @@ #include #include "common.h" -#include "file_data.h" +#include "../branch_set.h" struct ChangeList { @@ -20,7 +20,7 @@ struct ChangeList std::string user; std::string description; int64_t timestamp = 0; - std::vector changedFiles; + std::unique_ptr changedFileGroups = ChangedFileGroups::Empty(); std::shared_ptr> filesDownloaded = std::make_shared>(-1); std::shared_ptr> canDownload = std::make_shared>(false); @@ -37,8 +37,8 @@ struct ChangeList ChangeList& operator=(ChangeList&&) = default; ~ChangeList() = default; - void PrepareDownload(); - void StartDownload(const std::string& depotPath, const int& printBatch, const bool includeBinaries); + void PrepareDownload(const BranchSet& branchSet); + void StartDownload(const int& printBatch); void Flush(std::shared_ptr> printBatchFiles, std::shared_ptr> printBatchFileData); void WaitForDownload(); void Clear(); diff --git a/p4-fusion/commands/changes_result.h b/p4-fusion/commands/changes_result.h index 363516e4..9159bf73 100644 --- a/p4-fusion/commands/changes_result.h +++ b/p4-fusion/commands/changes_result.h @@ -7,10 +7,6 @@ #pragma once #include -#include -#include -#include -#include #include "common.h" diff --git a/p4-fusion/commands/client_result.h b/p4-fusion/commands/client_result.h index 
1b610079..d65fa6d6 100644 --- a/p4-fusion/commands/client_result.h +++ b/p4-fusion/commands/client_result.h @@ -11,7 +11,6 @@ #include "common.h" #include "result.h" -#include "p4/mapapi.h" class ClientResult : public Result { diff --git a/p4-fusion/commands/describe_result.cc b/p4-fusion/commands/describe_result.cc index 9ecc95da..e1c30342 100644 --- a/p4-fusion/commands/describe_result.cc +++ b/p4-fusion/commands/describe_result.cc @@ -20,17 +20,12 @@ int DescribeResult::OutputStatPartial(StrDict* varList) // Quick exit if the object returned is not a file return 0; } - StrPtr* type = varList->GetVar(("type" + indexString).c_str()); - StrPtr* revision = varList->GetVar(("rev" + indexString).c_str()); - StrPtr* action = varList->GetVar(("action" + indexString).c_str()); + std::string depotFileStr = depotFile->Text(); + std::string type = varList->GetVar(("type" + indexString).c_str())->Text(); + std::string revision = varList->GetVar(("rev" + indexString).c_str())->Text(); + std::string action = varList->GetVar(("action" + indexString).c_str())->Text(); - m_FileData.push_back(FileData {}); - FileData* fileData = &m_FileData.back(); - - fileData->depotFile = depotFile->Text(); - fileData->revision = revision->Text(); - fileData->type = type->Text(); - fileData->action = action->Text(); + m_FileData.push_back(FileData(depotFileStr, revision, action, type)); return 1; } diff --git a/p4-fusion/commands/file_data.cc b/p4-fusion/commands/file_data.cc new file mode 100644 index 00000000..4c8ff762 --- /dev/null +++ b/p4-fusion/commands/file_data.cc @@ -0,0 +1,210 @@ +/* + * Copyright (c) 2022 Salesforce, Inc. + * All rights reserved. 
+ * SPDX-License-Identifier: BSD-3-Clause + * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause + */ +#include "file_data.h" + +FileDataStore::FileDataStore() + : actionCategory(FileAction::FileAdd) + , isContentsSet(false) + , isContentsPendingDownload(false) +{ +} + +FileData::FileData(std::string& depotFile, std::string& revision, std::string& action, std::string& type) + : m_data(std::make_shared()) +{ + m_data->depotFile = depotFile; + m_data->revision = revision; + m_data->SetAction(action); + m_data->type = type; + m_data->isContentsSet = false; + m_data->isContentsPendingDownload = false; +} + +FileData::FileData(const FileData& copy) + : m_data(copy.m_data) +{ +} + +FileData& FileData::operator=(FileData& other) +{ + if (this == &other) + { + // guard... + return *this; + } + m_data = other.m_data; + return *this; +} + +void FileData::SetFromDepotFile(const std::string& fromDepotFile, const std::string& fromRevision) +{ + m_data->fromDepotFile = fromDepotFile; + if (STDHelpers::StartsWith(fromRevision, "#")) + { + m_data->fromRevision = fromRevision.substr(1); + } + else + { + m_data->fromRevision = fromRevision; + } +} + +void FileData::MoveContentsOnceFrom(const std::vector& contents) +{ + // TODO double-check the thread logic here. It needs to be thread safe. + + if (m_data->isContentsSet) + { + // Do not set the contents. Assume that + // they were already set or, worst case, are currently being set. 
+ return; + } + m_data->isContentsSet = true; + m_data->contents = std::move(contents); + m_data->isContentsPendingDownload = false; +} + +void FileData::SetPendingDownload() +{ + if (!m_data->isContentsSet) + { + m_data->isContentsPendingDownload = true; + } +} + +void FileData::SetRelativePath(std::string& relativePath) +{ + m_data->relativePath = relativePath; +} + +bool FileData::IsBinary() const +{ + return STDHelpers::Contains(m_data->type, "binary"); +} + +bool FileData::IsExecutable() const +{ + return STDHelpers::Contains(m_data->type, "+x"); +} + +FileAction extrapolateFileAction(std::string& action); + +void FileDataStore::SetAction(std::string fileAction) +{ + action = fileAction; + actionCategory = extrapolateFileAction(fileAction); + switch (actionCategory) + { + case FileAction::FileBranch: + case FileAction::FileMoveAdd: + case FileAction::FileIntegrate: + case FileAction::FileImport: + isIntegrated = true; + isDeleted = false; + break; + + case FileAction::FileDelete: + case FileAction::FileMoveDelete: + case FileAction::FilePurge: + // Note: not including FileAction::FileArchive + isDeleted = true; + isIntegrated = false; + break; + + case FileAction::FileIntegrateDelete: + // This is the source of the integration, + // so even though this causes a delete to happen, + // as a source, there isn't something merging into this + // change. 
+ isIntegrated = false; + isDeleted = true; + break; + + default: + isIntegrated = false; + isDeleted = false; + } +} + +void FileDataStore::Clear() +{ + depotFile.clear(); + revision.clear(); + action.clear(); + type.clear(); + fromDepotFile.clear(); + fromRevision.clear(); + contents.clear(); + relativePath.clear(); +} + +FileAction extrapolateFileAction(std::string& action) +{ + if ("add" == action) + { + return FileAction::FileAdd; + } + if ("edit" == action) + { + return FileAction::FileEdit; + } + if ("delete" == action) + { + return FileAction::FileDelete; + } + if ("branch" == action) + { + return FileAction::FileBranch; + } + if ("move/add" == action) + { + return FileAction::FileMoveAdd; + } + if ("move/delete" == action) + { + return FileAction::FileMoveDelete; + } + if ("integrate" == action) + { + return FileAction::FileIntegrate; + } + if ("import" == action) + { + return FileAction::FileImport; + } + if ("purge" == action) + { + return FileAction::FilePurge; + } + if ("archive" == action) + { + return FileAction::FileArchive; + } + if (FAKE_INTEGRATION_DELETE_ACTION_NAME == action) + { + return FileAction::FileIntegrateDelete; + } + + // That's all the actions known at the time of writing. + // An unknown type, probably some future Perforce version with a new kind of action. + if (STDHelpers::Contains(action, "delete")) + { + // Looks like a delete. + WARN("Found an unsupported action " << action << "; assuming delete"); + return FileAction::FileDelete; + } + if (STDHelpers::Contains(action, "move/")) + { + // Looks like a new kind of integrate. + WARN("Found an unsupported action " << action << "; assuming move/add"); + return FileAction::FileMoveAdd; + } + + // assume an edit, as it's the safe bet. 
+ WARN("Found an unsupported action " << action << "; assuming edit"); + return FileAction::FileEdit; +} diff --git a/p4-fusion/commands/file_data.h b/p4-fusion/commands/file_data.h index 83546702..82e4781d 100644 --- a/p4-fusion/commands/file_data.h +++ b/p4-fusion/commands/file_data.h @@ -6,23 +6,97 @@ */ #pragma once +#include +#include #include "common.h" +#include "utils/std_helpers.h" -struct FileData +#define FAKE_INTEGRATION_DELETE_ACTION_NAME "FAKE merge delete" + +// See https://www.perforce.com/manuals/cmdref/Content/CmdRef/p4_fstat.html +// for a list of actions. +enum FileAction +{ + FileAdd, // add + FileEdit, // edit + FileDelete, // delete + FileBranch, // branch + FileMoveAdd, // move/add + FileMoveDelete, // move/delete + FileIntegrate, // integrate + FileImport, // import + FilePurge, // purge + FileArchive, // archive + + FileIntegrateDelete, // artificial action to reflect an integration that caused a delete +}; + +struct FileDataStore { + // describe/filelog values std::string depotFile; std::string revision; std::string action; std::string type; + + // filelog values + // - empty if not an integration style change + std::string fromDepotFile; + std::string fromRevision; + + // print values + // the "is*" values here are intended to put the + // brakes on possible multi-threaded downloads. std::vector contents; - bool shouldCommit = false; - - void Clear() - { - depotFile.clear(); - revision.clear(); - action.clear(); - type.clear(); - contents.clear(); - } + std::atomic isContentsSet; + std::atomic isContentsPendingDownload; + + // Derived Values + std::string relativePath; + FileAction actionCategory; + bool isDeleted; + bool isIntegrated; // ... or copied, or moved, or ... + + FileDataStore(); + + void SetAction(std::string action); + + void Clear(); +}; + +// For memory efficiency; the underlying data is passed around a bunch. 
+struct FileData +{ +private: + std::shared_ptr<FileDataStore> m_data; + +public: + FileData(std::string& depotFile, std::string& revision, std::string& action, std::string& type); + FileData(const FileData& copy); + FileData& operator=(FileData& other); + + void SetFromDepotFile(const std::string& fromDepotFile, const std::string& fromRevision); + void SetRelativePath(std::string& relativePath); + void SetFakeIntegrationDeleteAction() { m_data->SetAction(FAKE_INTEGRATION_DELETE_ACTION_NAME); }; + + // moves the argument's data into this file data structure. + void MoveContentsOnceFrom(const std::vector<char>& contents); + void SetPendingDownload(); + bool IsDownloadNeeded() const { return !m_data->isContentsSet && !m_data->isContentsPendingDownload; }; + bool IsReady() const { return m_data->isContentsSet; } + + const std::string& GetDepotFile() const { return m_data->depotFile; }; + const std::string& GetRevision() const { return m_data->revision; }; + const FileAction GetAction() const { return m_data->actionCategory; }; + const std::string& GetRelativePath() const { return m_data->relativePath; }; + const std::vector<char>& GetContents() const { return m_data->contents; }; + bool IsDeleted() const { return m_data->isDeleted; }; + bool IsIntegrated() const { return m_data->isIntegrated; }; + std::string& GetFromDepotFile() const { return m_data->fromDepotFile; }; + std::string& GetFromRevision() const { return m_data->fromRevision; }; + + bool IsBinary() const; + bool IsExecutable() const; + + void Clear() { m_data->Clear(); }; }; diff --git a/p4-fusion/commands/file_map.cc b/p4-fusion/commands/file_map.cc new file mode 100644 index 00000000..517c85b3 --- /dev/null +++ b/p4-fusion/commands/file_map.cc @@ -0,0 +1,192 @@ +/* + * Copyright (c) 2022 Salesforce, Inc. + * All rights reserved.
+ * SPDX-License-Identifier: BSD-3-Clause + * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause + */ +#include "file_map.h" + +FileMap::FileMap() + : m_sensitivity(MapCase::Sensitive) +{ + // The constructor line + this m_map set are the equivalent of calling SetCaseSensitivity locally. + m_map.SetCaseSensitivity(m_sensitivity); +} + +FileMap::FileMap(const FileMap& src) + : m_sensitivity(src.GetCaseSensitivity()) +{ + // This call sets the case sensitivity of m_map. + src.copyMapApiInto(m_map); +} + +bool FileMap::IsInLeft(const std::string fileRevision) const +{ + MapApi argMap; + argMap.SetCaseSensitivity(m_sensitivity); + argMap.Insert(StrBuf(fileRevision.c_str()), MapType::MapInclude); + + // MapAPI is poorly written and doesn't declare things as const when it should. + return MapApi::Join(const_cast<MapApi*>(&m_map), &argMap) != nullptr; +} + +bool FileMap::IsInRight(const std::string fileRevision) const +{ + StrBuf to; + StrBuf from(fileRevision.c_str()); + + // MapAPI is poorly written and doesn't declare things as const when it should. + MapApi* ref = const_cast<MapApi*>(&m_map); + return ref->Translate(from, to); +} + +void FileMap::SetCaseSensitivity(const MapCase mode) +{ + m_sensitivity = mode; + m_map.SetCaseSensitivity(mode); +} + +std::string FileMap::TranslateLeftToRight(const std::string& path) const +{ + StrBuf from(path.c_str()); + StrBuf to; + + // MapAPI is poorly written and doesn't declare things as const when it should. + MapApi* ref = const_cast<MapApi*>(&m_map); + if (ref->Translate(from, to, MapDir::MapLeftRight)) + { + return to.Text(); + } + return ""; +} + +std::string FileMap::TranslateRightToLeft(const std::string& path) const +{ + StrBuf from(path.c_str()); + StrBuf to; + + // MapAPI is poorly written and doesn't declare things as const when it should.
+ MapApi* ref = const_cast<MapApi*>(&m_map); + if (ref->Translate(from, to, MapDir::MapRightLeft)) + { + return to.Text(); + } + return ""; +} + +void FileMap::insertMapping(const std::string& left, const std::string& right, const MapType mapType) +{ + std::string mapStrLeft = left; + mapStrLeft.erase(mapStrLeft.find_last_not_of(' ') + 1); + mapStrLeft.erase(0, mapStrLeft.find_first_not_of(' ')); + + std::string mapStrRight = right; + mapStrRight.erase(mapStrRight.find_last_not_of(' ') + 1); + mapStrRight.erase(0, mapStrRight.find_first_not_of(' ')); + + m_map.Insert(StrBuf(mapStrLeft.c_str()), StrBuf(mapStrRight.c_str()), mapType); +} + +void FileMap::InsertTranslationMapping(const std::vector<std::string>& mapping) +{ + for (int i = 0; i < mapping.size(); i++) + { + const std::string& view = mapping.at(i); + + size_t left = view.find('/'); + + MapType mapType = MapType::MapInclude; + switch (view.front()) + { + case '+': + mapType = MapType::MapOverlay; + break; + case '-': + mapType = MapType::MapExclude; + break; + case '&': + mapType = MapType::MapOneToMany; + break; + } + + // TODO This also needs quote handling + + // Skip the first few characters to only match with the right half. + size_t right = view.find("//", 3); + if (right == std::string::npos) + { + WARN("Found a one-sided mapping, ignoring..."); + continue; + } + + insertMapping(view.substr(left, right - left), view.substr(right), mapType); + } +} + +void FileMap::InsertPaths(const std::vector<std::string>& paths) +{ + for (int i = 0; i < paths.size(); i++) + { + const std::string& view = paths.at(i); + insertMapping(view, view, MapType::MapInclude); + } +} + +void FileMap::InsertFileMap(const FileMap& src) +{ + src.copyMapApiInto(m_map); +} + +const std::vector<std::string> PATH_PREFIX_DESCRIPTIONS = { + // Order is important.
+ "share ", + "isolate ", + "import+ ", + "import ", + "exclude " +}; +const int PATH_PREFIX_DESCRIPTION_COUNT = 5; +const int PATH_PREFIX_DESCRIPTION_EXCLUDE_INDEX_START = 4; + +void FileMap::InsertPrefixedPaths(const std::string prefix, const std::vector<std::string>& paths) +{ + for (int i = 0; i < paths.size(); i++) + { + MapType mapType = MapType::MapInclude; + std::string view = paths.at(i); + + // Some paths, such as the Stream spec, can include a prefix. + for (int j = 0; j < PATH_PREFIX_DESCRIPTION_COUNT; j++) + { + size_t match = view.find(PATH_PREFIX_DESCRIPTIONS[j]); + if (match == 0) + { + if (j >= PATH_PREFIX_DESCRIPTION_EXCLUDE_INDEX_START) + { + mapType = MapType::MapExclude; + } + view.erase(0, PATH_PREFIX_DESCRIPTIONS[j].size()); + break; + } + } + + view = prefix + view; + insertMapping(view, view, mapType); + } +} + +void FileMap::copyMapApiInto(MapApi& map) const +{ + // MapAPI is poorly written and doesn't declare things as const when it should. + MapApi* ref = const_cast<MapApi*>(&m_map); + + map.Clear(); + map.SetCaseSensitivity(m_sensitivity); + for (int i = 0; i < ref->Count(); i++) + { + map.Insert( + StrBuf(ref->GetLeft(i)->Text()), + StrBuf(ref->GetRight(i)->Text()), + ref->GetType(i)); + } +} diff --git a/p4-fusion/commands/file_map.h b/p4-fusion/commands/file_map.h new file mode 100644 index 00000000..303e558c --- /dev/null +++ b/p4-fusion/commands/file_map.h @@ -0,0 +1,52 @@ +/* + * Copyright (c) 2022 Salesforce, Inc. + * All rights reserved.
+ * SPDX-License-Identifier: BSD-3-Clause + * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause + */ + +#pragma once + +#include +#include + +#include "common.h" + +struct FileMap +{ +private: + MapApi m_map; + MapCase m_sensitivity; + + void insertMapping(const std::string& left, const std::string& right, const MapType mapType); + void copyMapApiInto(MapApi& map) const; + +public: + FileMap(); + FileMap(const FileMap& src); + + bool IsInLeft(const std::string fileRevision) const; + bool IsInRight(const std::string fileRevision) const; + + void SetCaseSensitivity(const MapCase mode); + MapCase GetCaseSensitivity() const { return m_sensitivity; }; + + // TranslateLeftToRight turn a left to a right. + // Returns an empty string if the mapping is invalid. + std::string TranslateLeftToRight(const std::string& path) const; + + // TranslateRightToLeft turn a right to a left. + // Returns an empty string if the mapping is invalid. + std::string TranslateRightToLeft(const std::string& path) const; + + // "//a/... b/..." format + void InsertTranslationMapping(const std::vector& mapping); + + // "//a/..." format + void InsertPaths(const std::vector& paths); + + // "..." format + void InsertPrefixedPaths(const std::string prefix, const std::vector& paths); + + void InsertFileMap(const FileMap& src); +}; diff --git a/p4-fusion/commands/filelog_result.cc b/p4-fusion/commands/filelog_result.cc new file mode 100644 index 00000000..47e2edb3 --- /dev/null +++ b/p4-fusion/commands/filelog_result.cc @@ -0,0 +1,80 @@ +/* + * Copyright (c) 2022 Salesforce, Inc. + * All rights reserved. + * SPDX-License-Identifier: BSD-3-Clause + * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause + */ +#include "filelog_result.h" + +// Should be called once per varlist. Each filelog file +// is its own entry. 
+void FileLogResult::OutputStat(StrDict* varList) +{ + StrPtr* depotFile = varList->GetVar("depotFile"); + if (!depotFile) + { + // Quick exit if the object returned is not a file + return; + } + std::string depotFileStr = depotFile->Text(); + // Only get the first record... + // std::string changelist = varList->GetVar(("change0").c_str())->Text(); + std::string type = varList->GetVar("type0")->Text(); + std::string revision = varList->GetVar("rev0")->Text(); + std::string action = varList->GetVar("action0")->Text(); + + m_FileData.push_back(FileData(depotFileStr, revision, action, type)); + FileData& fileData = m_FileData.back(); + + // Could optimize here by only performing this loop if the action type is + // an integration style action (entry->isIntegration == true). + // That needs testing, though. + int i = 0; + StrPtr* how = nullptr; + while (true) + { + std::string indexString = std::to_string(i++); + how = varList->GetVar(("how0," + indexString).c_str()); + + if (!how) + { + break; + } + + std::string howStr = how->Text(); + + // How text values listed at: + // https://www.perforce.com/manuals/cmdref/Content/CmdRef/p4_integrated.html + + // "* into" - integrated to another location from the current depot file. + // This tool doesn't care about this action. These are ignored. + // "* from" - integrated into this depot file from another location. Definitely care about these. + // (* here is "add", "merge", "branch", "moved", "copy", "delete", "edit") + // "Add w/ Edit", "Merge w/ Edit" - an integrate + an edit on top of the merge. + // (Add - it hasn't existed before in the target; Merge - it already existed there and is being edited) + // This one is seen often in Java move operations between trees when the "package" line needs to + // change with the move. This rarely happens cross-branch, and when it does, it's not + // really a merge operation. 
+ // "undid" - a "revert changelist" action from a previous revision of the same file (target of p4 undo) + // "undone by" - the "* into" concept for "undid" (source of p4 undo) + // "Undone w/Edit" + + if (howStr == "delete from") + { + // The action needs to be marked as something very clearly a delete. + // See file_data.h and file_data.cc for this special replaced action. + fileData.SetFakeIntegrationDeleteAction(); + } + + if (STDHelpers::EndsWith(howStr, " from")) + { + // copy or integrate or branch or move or archive from a location. + std::string fromDepotFile = varList->GetVar(("file0," + indexString).c_str())->Text(); + std::string fromRev = varList->GetVar(("erev0," + indexString).c_str())->Text(); + fileData.SetFromDepotFile(fromDepotFile, fromRev); + + // Don't look for any other integration history; there can (should?) be at most one. + break; + } + } +} diff --git a/p4-fusion/commands/files_result.h b/p4-fusion/commands/filelog_result.h similarity index 50% rename from p4-fusion/commands/files_result.h rename to p4-fusion/commands/filelog_result.h index a9286539..ed01465d 100644 --- a/p4-fusion/commands/files_result.h +++ b/p4-fusion/commands/filelog_result.h @@ -6,28 +6,23 @@ */ #pragma once +#include +#include + #include "common.h" #include "result.h" +#include "file_data.h" +#include "utils/std_helpers.h" -class FilesResult : public Result +// Very limited to just a single file log entry per file. 
+class FileLogResult : public Result { -public: - struct FileData - { - std::string depotFile; - std::string revision; - std::string change; - std::string action; - std::string type; - std::string time; - std::string size; - }; - private: - std::vector m_Files; + std::vector m_FileData; public: - std::vector& GetFilesResult() { return m_Files; } + const std::vector& GetFileData() const { return m_FileData; } void OutputStat(StrDict* varList) override; + // int OutputStatPartial(StrDict* varList) override; }; diff --git a/p4-fusion/commands/files_result.cc b/p4-fusion/commands/files_result.cc deleted file mode 100644 index 55b80c4d..00000000 --- a/p4-fusion/commands/files_result.cc +++ /dev/null @@ -1,21 +0,0 @@ -/* - * Copyright (c) 2022 Salesforce, Inc. - * All rights reserved. - * SPDX-License-Identifier: BSD-3-Clause - * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause - */ -#include "files_result.h" - -void FilesResult::OutputStat(StrDict* varList) -{ - FileData data; - - data.depotFile = varList->GetVar("depotFile")->Text(); - data.revision = varList->GetVar("rev")->Text(); - data.change = varList->GetVar("change")->Text(); - data.action = varList->GetVar("action")->Text(); - data.time = varList->GetVar("time")->Text(); - data.type = varList->GetVar("type")->Text(); - - m_Files.push_back(data); -} diff --git a/p4-fusion/common.h b/p4-fusion/common.h index 189cccee..a52cb88c 100644 --- a/p4-fusion/common.h +++ b/p4-fusion/common.h @@ -13,4 +13,8 @@ #include "p4/clientapi.h" +// This is poorly written and must be included only once, +// so it's tucked in here, which has the proper 'once' pragma. 
+#include "p4/mapapi.h" + #include "log.h" diff --git a/p4-fusion/git_api.cc b/p4-fusion/git_api.cc index 2e0a2d07..9234553c 100644 --- a/p4-fusion/git_api.cc +++ b/p4-fusion/git_api.cc @@ -84,6 +84,54 @@ bool GitAPI::IsHEADExists() return errorCode == 0; } +void GitAPI::SetActiveBranch(const std::string& branchName) +{ + if (branchName == m_CurrentBranch) + { + return; + } + + int errorCode; + // Look up the branch. + git_reference* branch; + errorCode = git_reference_lookup(&branch, m_Repo, ("refs/heads/" + branchName).c_str()); + if (errorCode != 0 && errorCode != GIT_ENOTFOUND) + { + GIT2(errorCode); + } + if (errorCode == GIT_ENOTFOUND) + { + // Create the branch from the first index. + git_commit* firstCommit = nullptr; + GIT2(git_commit_lookup(&firstCommit, m_Repo, &m_FirstCommitOid)); + GIT2(git_branch_create(&branch, m_Repo, branchName.c_str(), firstCommit, 0)); + git_commit_free(firstCommit); + } + + // Make head point to the branch. + git_reference* head; + GIT2(git_reference_symbolic_create(&head, m_Repo, "HEAD", git_reference_name(branch), 1, branchName.c_str())); + git_reference_free(head); + git_reference_free(branch); + + // Now update the files in the index to match the content of the commit pointed at by HEAD. 
+ git_oid oidParentCommit; + GIT2(git_reference_name_to_id(&oidParentCommit, m_Repo, "HEAD")); + + git_commit* headCommit = nullptr; + GIT2(git_commit_lookup(&headCommit, m_Repo, &oidParentCommit)); + + git_tree* headCommitTree = nullptr; + GIT2(git_commit_tree(&headCommitTree, headCommit)); + + GIT2(git_index_read_tree(m_Index, headCommitTree)); + + git_tree_free(headCommitTree); + git_commit_free(headCommit); + + m_CurrentBranch = branchName; +} + git_oid GitAPI::CreateBlob(const std::vector<char>& data) { git_oid oid; @@ -100,8 +148,12 @@ std::string GitAPI::DetectLatestCL() GIT2(git_commit_lookup(&headCommit, m_Repo, &oid)); std::string message = git_commit_message(headCommit); - size_t clStart = message.find_last_of("change = ") + 1; - std::string cl(message.begin() + clStart, message.end() - 1); + // Look for the specific change message generated from the Commit method. + // Note that extra branching information can be added after it. + // ": change = " is 11 characters long. + size_t clStart = message.rfind(": change = ") + 11; + size_t clEnd = message.find(']', clStart); + std::string cl(message, clStart, clEnd - clStart); git_commit_free(headCommit); @@ -130,15 +182,45 @@ void GitAPI::CreateIndex() git_tree_free(head_commit_tree); git_commit_free(head_commit); + // Find the first (root) commit; the reverse sort makes the walk start from the oldest commit instead of HEAD. + git_revwalk* walk; + git_revwalk_new(&walk, m_Repo); + git_revwalk_sorting(walk, GIT_SORT_TOPOLOGICAL | GIT_SORT_REVERSE); + git_revwalk_push_head(walk); + git_revwalk_next(&m_FirstCommitOid, walk); + WARN("Loaded index was refreshed to match the tree of the current HEAD commit"); } else { - WARN("No HEAD commit was found. Created a fresh index."); + // In order to have branches be mergeable, even with no shared history, we perform + // a trick by adding an empty commit as the very first commit, and use this as the base for all branches. + // The time is set to the beginning of time.
+ git_oid commitTreeID; + GIT2(git_index_write_tree_to(&commitTreeID, m_Index, m_Repo)); + + git_tree* commitTree = nullptr; + GIT2(git_tree_lookup(&commitTree, m_Repo, &commitTreeID)); + + git_signature* author = nullptr; + GIT2(git_signature_new(&author, "No User", "no@user", 0, 0)); + + git_reference* ref = nullptr; + git_object* parent = nullptr; + git_revparse_ext(&parent, &ref, m_Repo, "HEAD"); + + GIT2(git_commit_create_v(&m_FirstCommitOid, m_Repo, "HEAD", author, author, "UTF-8", "Initial repository.", commitTree, parent ? 1 : 0, parent)); + + git_object_free(parent); + git_reference_free(ref); + git_signature_free(author); + git_tree_free(commitTree); + + WARN("No HEAD commit was found. Created fresh index " << git_oid_tostr_s(&m_FirstCommitOid) << "."); } } -void GitAPI::AddFileToIndex(const std::string& depotPath, const std::string& depotFile, const std::vector& contents, const bool plusx) +void GitAPI::AddFileToIndex(const std::string& relativePath, const std::vector& contents, const bool plusx) { MTR_SCOPE("Git", __func__); @@ -149,24 +231,16 @@ void GitAPI::AddFileToIndex(const std::string& depotPath, const std::string& dep entry.mode = GIT_FILEMODE_BLOB_EXECUTABLE; // 0100755 } - std::string depotPathTrunc = depotPath.substr(0, depotPath.size() - 3); // -3 to remove trailing ... - std::string gitFilePath = depotFile; - STDHelpers::Erase(gitFilePath, depotPathTrunc); - - entry.path = gitFilePath.c_str(); + entry.path = relativePath.c_str(); GIT2(git_index_add_from_buffer(m_Index, &entry, contents.data(), contents.size())); } -void GitAPI::RemoveFileFromIndex(const std::string& depotPath, const std::string& depotFile) +void GitAPI::RemoveFileFromIndex(const std::string& relativePath) { MTR_SCOPE("Git", __func__); - std::string depotPathTrunc = depotPath.substr(0, depotPath.size() - 3); // -3 to remove trailing ... 
- std::string gitFilePath = depotFile; - STDHelpers::Erase(gitFilePath, depotPathTrunc); - - GIT2(git_index_remove_bypath(m_Index, gitFilePath.c_str())); + GIT2(git_index_remove_bypath(m_Index, relativePath.c_str())); } std::string GitAPI::Commit( @@ -176,7 +250,8 @@ std::string GitAPI::Commit( const std::string& email, const int& timezone, const std::string& desc, - const int64_t& timestamp) + const int64_t& timestamp, + const std::string& mergeFromStream) { MTR_SCOPE("Git", __func__); @@ -189,23 +264,51 @@ std::string GitAPI::Commit( git_signature* author = nullptr; GIT2(git_signature_new(&author, user.c_str(), email.c_str(), timestamp, timezone)); - git_reference* ref = nullptr; - git_object* parent = nullptr; + // -3 to remove the trailing "..." + std::string commitMsg = cl + " - " + desc + "\n[p4-fusion: depot-paths = \"" + depotPath.substr(0, depotPath.size() - 3) + "\": change = " + cl + "]"; + + // Find the parent commits. + // Order is very important. + std::vector parentRefs = { "HEAD" }; + if (!mergeFromStream.empty()) { - int error = git_revparse_ext(&parent, &ref, m_Repo, "HEAD"); - if (error == GIT_ENOTFOUND) + parentRefs.push_back("refs/heads/" + mergeFromStream); + } + + git_commit* parents[2]; + int parentCount = 0; + for (std::string& parentRef : parentRefs) + { + git_oid refOid; + int errorCode = git_reference_name_to_id(&refOid, m_Repo, parentRef.c_str()); + if (errorCode != 0 && errorCode != GIT_ENOTFOUND) + { + GIT2(errorCode); + } + else if (errorCode != GIT_ENOTFOUND) { - WARN("GitAPI: HEAD not found. Creating first commit"); + git_commit* refCommit = nullptr; + GIT2(git_commit_lookup(&refCommit, m_Repo, &refOid)); + if (parentCount > 0) + { + commitMsg += "; merged from " + parentRef; + } + parents[parentCount++] = refCommit; } + // Skip the reference if it wasn't found. That means it doesn't + // exist yet, which means there wasn't a previous commit for it. 
+ // Practically speaking, this should only happen in non-branching + // mode on the first commit. } git_oid commitID; - // -3 to remove the trailing "..." - std::string commitMsg = cl + " - " + desc + "\n[p4-fusion: depot-paths = \"" + depotPath.substr(0, depotPath.size() - 3) + "\": change = " + cl + "]"; - GIT2(git_commit_create_v(&commitID, m_Repo, "HEAD", author, author, "UTF-8", commitMsg.c_str(), commitTree, parent ? 1 : 0, parent)); + GIT2(git_commit_create(&commitID, m_Repo, "HEAD", author, author, "UTF-8", commitMsg.c_str(), commitTree, parentCount, (const git_commit**)parents)); - git_object_free(parent); - git_reference_free(ref); + for (int i = 0; i < parentCount; i++) + { + git_commit_free(parents[i]); + parents[i] = nullptr; + } git_signature_free(author); git_tree_free(commitTree); diff --git a/p4-fusion/git_api.h b/p4-fusion/git_api.h index 7779d77f..7c8874e0 100644 --- a/p4-fusion/git_api.h +++ b/p4-fusion/git_api.h @@ -19,6 +19,9 @@ class GitAPI { git_repository* m_Repo = nullptr; git_index* m_Index = nullptr; + git_oid m_FirstCommitOid; + + std::string m_CurrentBranch = ""; public: GitAPI(bool fsyncEnable); @@ -34,8 +37,10 @@ class GitAPI git_oid CreateBlob(const std::vector& data); void CreateIndex(); - void AddFileToIndex(const std::string& depotPath, const std::string& depotFile, const std::vector& contents, const bool plusx); - void RemoveFileFromIndex(const std::string& depotPath, const std::string& depotFile); + void SetActiveBranch(const std::string& branchName); + void AddFileToIndex(const std::string& relativePath, const std::vector& contents, const bool plusx); + void RemoveFileFromIndex(const std::string& relativePath); + std::string Commit( const std::string& depotPath, const std::string& cl, @@ -43,6 +48,7 @@ class GitAPI const std::string& email, const int& timezone, const std::string& desc, - const int64_t& timestamp); + const int64_t& timestamp, + const std::string& mergeFromStream); void CloseIndex(); }; diff --git 
a/p4-fusion/main.cc b/p4-fusion/main.cc index 20be3208..1202bb3e 100644 --- a/p4-fusion/main.cc +++ b/p4-fusion/main.cc @@ -20,6 +20,7 @@ #include "thread_pool.h" #include "p4_api.h" #include "git_api.h" +#include "branch_set.h" #include "p4/p4libs.h" #include "minitrace.h" @@ -32,12 +33,14 @@ int Main(int argc, char** argv) { Timer programTimer; - Arguments::GetSingleton()->RequiredParameter("--path", "P4 depot path to convert to a Git repo"); + Arguments::GetSingleton()->RequiredParameter("--path", "P4 depot path to convert to a Git repo. If used with '--branch', this is the base path for the branches."); Arguments::GetSingleton()->RequiredParameter("--src", "Relative path where the git repository should be created. This path should be empty before running p4-fusion for the first time in a directory."); Arguments::GetSingleton()->RequiredParameter("--port", "Specify which P4PORT to use."); Arguments::GetSingleton()->RequiredParameter("--user", "Specify which P4USER to use. Please ensure that the user is logged in."); Arguments::GetSingleton()->RequiredParameter("--client", "Name/path of the client workspace specification."); Arguments::GetSingleton()->RequiredParameter("--lookAhead", "How many CLs in the future, at most, shall we keep downloaded by the time it is to commit them?"); + Arguments::GetSingleton()->OptionalParameterList("--branch", "A branch to migrate under the depot path. May be specified more than once. If at least one is given and the noMerge option is false, then the Git repository will include merges between branches in the history. 
You may use the formatting 'depot/path:git-alias', separating the Perforce branch sub-path from the git alias name by a ':'; if the depot path contains a ':', then you must provide the git branch alias."); + Arguments::GetSingleton()->OptionalParameter("--noMerge", "false", "Disable performing a Git merge when a Perforce branch integrates (or copies, etc) into another branch."); Arguments::GetSingleton()->OptionalParameter("--networkThreads", std::to_string(std::thread::hardware_concurrency()), "Specify the number of threads in the threadpool for running network calls. Defaults to the number of logical CPUs."); Arguments::GetSingleton()->OptionalParameter("--printBatch", "1", "Specify the p4 print batch size."); Arguments::GetSingleton()->OptionalParameter("--maxChanges", "-1", "Specify the max number of changelists which should be processed in a single run. -1 signifies unlimited range."); @@ -57,18 +60,21 @@ int Main(int argc, char** argv) return 0; } - bool noColor = Arguments::GetSingleton()->GetNoColor() != "false"; + const bool noColor = Arguments::GetSingleton()->GetNoColor() != "false"; if (noColor) { Log::DisableColoredOutput(); } + const bool noMerge = Arguments::GetSingleton()->GetNoMerge() != "false"; + const std::string depotPath = Arguments::GetSingleton()->GetDepotPath(); const std::string srcPath = Arguments::GetSingleton()->GetSourcePath(); const bool fsyncEnable = Arguments::GetSingleton()->GetFsyncEnable() != "false"; const bool includeBinaries = Arguments::GetSingleton()->GetIncludeBinaries() != "false"; const int maxChanges = std::atoi(Arguments::GetSingleton()->GetMaxChanges().c_str()); const int flushRate = std::atoi(Arguments::GetSingleton()->GetFlushRate().c_str()); + const std::vector branchNames = Arguments::GetSingleton()->GetBranches(); PRINT("Running p4-fusion from: " << argv[0]); @@ -151,6 +157,8 @@ int Main(int argc, char** argv) P4API::CommandRefreshThreshold = std::atoi(refreshStr.c_str()); } + BranchSet 
branchSet(P4API::ClientSpec.mapping, depotPath, branchNames, includeBinaries); + bool profiling = false; #if MTR_ENABLED profiling = true; @@ -171,6 +179,7 @@ int Main(int argc, char** argv) PRINT("Profiling: " << profiling); PRINT("Profiling Flush Rate: " << flushRate); PRINT("No Colored Output: " << noColor); + PRINT("Inspecting " << branchSet.Count() << " branches"); GitAPI git(fsyncEnable); @@ -201,7 +210,7 @@ int Main(int argc, char** argv) PRINT("Requesting changelists to convert from the Perforce server"); - std::vector changes = p4.Changes(depotPath, resumeFromCL, maxChanges).GetChanges(); + std::vector changes = std::move(p4.Changes(depotPath, resumeFromCL, maxChanges).GetChanges()); // Return early if we have no work to do if (changes.empty()) @@ -225,8 +234,8 @@ int Main(int argc, char** argv) { ChangeList& cl = changes.at(currentCL); - // Start gathering changed files with `p4 describe` - cl.PrepareDownload(); + // Start gathering changed files with `p4 describe` or `p4 filelog` + cl.PrepareDownload(branchSet); lastDownloadedCL = currentCL; } @@ -239,7 +248,7 @@ int Main(int argc, char** argv) ChangeList& cl = changes.at(currentCL); // Start running `p4 print` on changed files when the describe is finished - cl.StartDownload(depotPath, printBatch, includeBinaries); + cl.StartDownload(printBatch); } SUCCESS("Queued first " << startupDownloadsCount << " CLs up until CL " << changes.at(lastDownloadedCL).number << " for downloading"); @@ -277,52 +286,82 @@ int Main(int argc, char** argv) // Ensure the files are downloaded before committing them to the repository cl.WaitForDownload(); - for (auto& file : cl.changedFiles) + std::string fullName = cl.user; + std::string email = "deleted@user"; + if (users.find(cl.user) != users.end()) { - if (file.shouldCommit) // If the file survived the filters while being downloaded + fullName = users.at(cl.user).fullName; + email = users.at(cl.user).email; + } + + for (auto& branchGroup : 
cl.changedFileGroups->branchedFileGroups) + { + if (!branchGroup.targetBranch.empty()) + { + git.SetActiveBranch(branchGroup.targetBranch); + } + + for (auto& file : branchGroup.files) { - if (p4.IsDeleted(file.action)) + if (file.IsDeleted()) { - git.RemoveFileFromIndex(depotPath, file.depotFile); + git.RemoveFileFromIndex(file.GetRelativePath()); } else { - git.AddFileToIndex(depotPath, file.depotFile, file.contents, p4.IsExecutable(file.type)); + git.AddFileToIndex(file.GetRelativePath(), file.GetContents(), file.IsExecutable()); } // No use for keeping the contents in memory once it has been added file.Clear(); } - } - std::string fullName = cl.user; - std::string email = "deleted@user"; - if (users.find(cl.user) != users.end()) - { - fullName = users.at(cl.user).fullName; - email = users.at(cl.user).email; - } - std::string commitSHA = git.Commit(depotPath, - cl.number, - fullName, - email, - timezoneMinutes, - cl.description, - cl.timestamp); + std::string mergeFrom = ""; + if (branchGroup.hasSource && !noMerge) + { + // Only perform merging if the branch group explicitly declares that the change + // has a source, and if the user wants merging. + mergeFrom = branchGroup.sourceBranch; + } + std::string commitSHA = git.Commit(depotPath, + cl.number, + fullName, + email, + timezoneMinutes, + cl.description, + cl.timestamp, + mergeFrom); + + // For scripting/testing purposes... + PRINT("COMMIT:" << commitSHA << ":" << cl.number << ":" << branchGroup.targetBranch << ":"); + SUCCESS( + "CL " << cl.number << " --> Commit " << commitSHA + << " with " << branchGroup.files.size() << " files" + << (branchGroup.targetBranch.empty() + ? "" + : (" to branch " + branchGroup.targetBranch)) + << (branchGroup.sourceBranch.empty() + ? 
"" + : (" from branch " + branchGroup.sourceBranch)) + << "."); + } SUCCESS( - "CL " << cl.number << " --> Commit " << commitSHA - << " with " << cl.changedFiles.size() << " files (" << i + 1 << "/" << changes.size() << "|" << lastDownloadedCL - (long long)i << "). " - << "Elapsed " << commitTimer.GetTimeS() / 60.0f << " mins. " + "CL " << cl.number << " with " + << cl.changedFileGroups->totalFileCount << " files (" << i + 1 << "/" << changes.size() + << "|" << lastDownloadedCL - (long long)i + << "). Elapsed " << commitTimer.GetTimeS() / 60.0f << " mins. " << ((commitTimer.GetTimeS() / 60.0f) / (float)(i + 1)) * (changes.size() - i - 1) << " mins left."); + // Clear out finished changelist. + cl.Clear(); // Start downloading the CL chronologically after the last CL that was previously downloaded, if there's still some left if (lastDownloadedCL + 1 < changes.size()) { lastDownloadedCL++; ChangeList& downloadCL = changes.at(lastDownloadedCL); - downloadCL.PrepareDownload(); - downloadCL.StartDownload(depotPath, printBatch, includeBinaries); + downloadCL.PrepareDownload(branchSet); + downloadCL.StartDownload(printBatch); } // Occasionally flush the profiling data diff --git a/p4-fusion/p4_api.cc b/p4-fusion/p4_api.cc index b883773b..b16b8122 100644 --- a/p4-fusion/p4_api.cc +++ b/p4-fusion/p4_api.cc @@ -100,32 +100,12 @@ bool P4API::IsFileUnderDepotPath(const std::string& fileRevision, const std::str bool P4API::IsDepotPathUnderClientSpec(const std::string& depotPath) { - MapApi depotMap; - depotMap.Insert(StrBuf(depotPath.c_str()), MapType::MapInclude); - - return MapApi::Join(&m_ClientMapping, &depotMap) != nullptr; + return m_ClientMapping.IsInLeft(depotPath); } bool P4API::IsFileUnderClientSpec(const std::string& fileRevision) { - StrBuf to; - StrBuf from(fileRevision.c_str()); - return m_ClientMapping.Translate(from, to); -} - -bool P4API::IsDeleted(const std::string& action) -{ - return STDHelpers::Contains(action, "delete"); -} - -bool P4API::IsBinary(const 
std::string& fileType) -{ - return STDHelpers::Contains(fileType, "binary"); -} - -bool P4API::IsExecutable(const std::string& fileType) -{ - return STDHelpers::Contains(fileType, "+x"); + return m_ClientMapping.IsInRight(fileRevision); } bool P4API::CheckErrors(Error& e, StrBuf& msg) @@ -177,44 +157,7 @@ bool P4API::ShutdownLibraries() void P4API::AddClientSpecView(const std::vector& viewStrings) { - for (int i = 0; i < viewStrings.size(); i++) - { - const std::string& view = viewStrings.at(i); - - bool modification = view.front() != '/'; - - MapType mapType = MapType::MapInclude; - switch (view.front()) - { - case '+': - mapType = MapType::MapOverlay; - break; - case '-': - mapType = MapType::MapExclude; - break; - case '&': - mapType = MapType::MapOneToMany; - break; - } - - // Skip the first few characters to only match with the right half. - size_t right = view.find("//", 3); - if (right == std::string::npos) - { - WARN("Found a one-sided client mapping, ignoring..."); - continue; - } - - std::string mapStrLeft = view.substr(0, right).c_str() + modification; - mapStrLeft.erase(mapStrLeft.find_last_not_of(' ') + 1); - mapStrLeft.erase(0, mapStrLeft.find_first_not_of(' ')); - - std::string mapStrRight = view.substr(right).c_str(); - mapStrRight.erase(mapStrRight.find_last_not_of(' ') + 1); - mapStrRight.erase(0, mapStrRight.find_first_not_of(' ')); - - m_ClientMapping.Insert(StrBuf(mapStrLeft.c_str()), StrBuf(mapStrRight.c_str()), mapType); - } + m_ClientMapping.InsertTranslationMapping(viewStrings); } void P4API::UpdateClientSpec() @@ -323,9 +266,14 @@ DescribeResult P4API::Describe(const std::string& cl) cl }); } -FilesResult P4API::Files(const std::string& path) +FileLogResult P4API::FileLog(const std::string& changelist) { - return Run("files", { path }); + return Run("filelog", { + "-c", // restrict output to a single changelist + changelist, + "-m1", // don't get the full history, just the first entry. + "//..." 
// rather than require the path to be passed in, just list all files. + }); } SizesResult P4API::Size(const std::string& file) diff --git a/p4-fusion/p4_api.h b/p4-fusion/p4_api.h index b8a61c5e..88eafbd3 100644 --- a/p4-fusion/p4_api.h +++ b/p4-fusion/p4_api.h @@ -12,9 +12,10 @@ #include "common.h" +#include "commands/file_map.h" #include "commands/changes_result.h" #include "commands/describe_result.h" -#include "commands/files_result.h" +#include "commands/filelog_result.h" #include "commands/sizes_result.h" #include "commands/sync_result.h" #include "commands/print_result.h" @@ -26,7 +27,7 @@ class P4API { ClientApi m_ClientAPI; - MapApi m_ClientMapping; + FileMap m_ClientMapping; int m_Usage; bool Initialize(); @@ -60,9 +61,6 @@ class P4API bool IsFileUnderDepotPath(const std::string& fileRevision, const std::string& depotPath); bool IsDepotPathUnderClientSpec(const std::string& depotPath); bool IsFileUnderClientSpec(const std::string& fileRevision); - bool IsDeleted(const std::string& action); - bool IsBinary(const std::string& fileType); - bool IsExecutable(const std::string& fileType); void AddClientSpecView(const std::vector& viewStrings); @@ -74,7 +72,7 @@ class P4API ChangesResult LatestChange(const std::string& path); ChangesResult OldestChange(const std::string& path); DescribeResult Describe(const std::string& cl); - FilesResult Files(const std::string& path); + FileLogResult FileLog(const std::string& changelist); SizesResult Size(const std::string& file); Result Sync(); Result Sync(const std::string& path); diff --git a/p4-fusion/utils/arguments.cc b/p4-fusion/utils/arguments.cc index abcfcab8..e24d8e80 100644 --- a/p4-fusion/utils/arguments.cc +++ b/p4-fusion/utils/arguments.cc @@ -16,7 +16,7 @@ void Arguments::Initialize(int argc, char** argv) if (m_Parameters.find(name) != m_Parameters.end()) { - m_Parameters.at(name).value = argv[i + 1]; + m_Parameters.at(name).valueList.push_back(argv[i + 1]); m_Parameters.at(name).isSet = true; } else @@ 
-36,11 +36,25 @@ std::string Arguments::GetParameter(const std::string& argName) const { if (m_Parameters.find(argName) != m_Parameters.end()) { - return m_Parameters.at(argName).value; + if (m_Parameters.at(argName).valueList.empty()) + { + return ""; + } + // Use the last specified version of the parameter. + return m_Parameters.at(argName).valueList.back(); } return ""; } +std::vector Arguments::GetParameterList(const std::string& argName) const +{ + if (m_Parameters.find(argName) != m_Parameters.end()) + { + return m_Parameters.at(argName).valueList; + } + return {}; +} + bool Arguments::IsValid() const { for (auto& arg : m_Parameters) @@ -70,7 +84,7 @@ std::string Arguments::Help() } else { - text += "\033[93m[Optional, Default is " + (paramData.value.empty() ? "empty" : paramData.value) + "]\033[0m"; + text += "\033[93m[Optional, Default is " + (paramData.valueList.empty() ? "empty" : paramData.valueList.back()) + "]\033[0m"; } text += "\n " + paramData.helpText + "\n\n"; } @@ -83,7 +97,6 @@ void Arguments::RequiredParameter(const std::string& name, const std::string& he ParameterData paramData; paramData.isRequired = true; paramData.isSet = false; - paramData.value = ""; paramData.helpText = helpText; m_Parameters[name] = paramData; @@ -94,7 +107,17 @@ void Arguments::OptionalParameter(const std::string& name, const std::string& de ParameterData paramData; paramData.isRequired = false; paramData.isSet = false; - paramData.value = defaultValue; + paramData.valueList.push_back(defaultValue); + paramData.helpText = helpText; + + m_Parameters[name] = paramData; +} + +void Arguments::OptionalParameterList(const std::string& name, const std::string& helpText) +{ + ParameterData paramData; + paramData.isRequired = false; + paramData.isSet = false; paramData.helpText = helpText; m_Parameters[name] = paramData; diff --git a/p4-fusion/utils/arguments.h b/p4-fusion/utils/arguments.h index 4a30d68c..a18c1a7d 100644 --- a/p4-fusion/utils/arguments.h +++ 
b/p4-fusion/utils/arguments.h @@ -8,6 +8,7 @@ #include #include +#include #include "common.h" @@ -17,19 +18,21 @@ class Arguments { bool isRequired; bool isSet; - std::string value; + std::vector valueList; std::string helpText; }; std::map m_Parameters; std::string GetParameter(const std::string& argName) const; + std::vector GetParameterList(const std::string& argName) const; public: static Arguments* GetSingleton(); void RequiredParameter(const std::string& name, const std::string& helpText); void OptionalParameter(const std::string& name, const std::string& defaultValue, const std::string& helpText); + void OptionalParameterList(const std::string& name, const std::string& helpText); void Initialize(int argc, char** argv); bool IsValid() const; std::string Help(); @@ -52,4 +55,6 @@ class Arguments std::string GetMaxChanges() const { return GetParameter("--maxChanges"); }; std::string GetFlushRate() const { return GetParameter("--flushRate"); }; std::string GetNoColor() const { return GetParameter("--noColor"); }; + std::string GetNoMerge() const { return GetParameter("--noMerge"); }; + std::vector GetBranches() const { return GetParameterList("--branch"); }; }; diff --git a/p4-fusion/utils/std_helpers.cc b/p4-fusion/utils/std_helpers.cc index cb752474..a3414fc7 100644 --- a/p4-fusion/utils/std_helpers.cc +++ b/p4-fusion/utils/std_helpers.cc @@ -40,3 +40,36 @@ void STDHelpers::Erase(std::string& source, const std::string& subStr) source.erase(source.find(subStr), subStr.size()); } + +void STDHelpers::StripSurrounding(std::string& source, const char c) +{ + const size_t size = source.size(); + size_t start = 0; + size_t end = size; + while (start < end && source[start] == c) + { + start++; + } + while (end > start && source[end - 1] == c) + { + end--; + } + if (end < size) + { + source.erase(end, size - end); + } + if (start > 0) + { + source.erase(0, start); + } +} + +std::array STDHelpers::SplitAt(const std::string& source, const char c, const size_t 
startAt) +{ + size_t pos = source.find(c, startAt); + if (pos != std::string::npos && pos < source.size()) + { + return { source.substr(startAt, pos - startAt), source.substr(pos + 1) }; + } + return { source, "" }; +} diff --git a/p4-fusion/utils/std_helpers.h b/p4-fusion/utils/std_helpers.h index 2a1108f7..0e7c581a 100644 --- a/p4-fusion/utils/std_helpers.h +++ b/p4-fusion/utils/std_helpers.h @@ -7,6 +7,7 @@ #pragma once #include +#include class STDHelpers { @@ -15,4 +16,9 @@ class STDHelpers static bool StartsWith(const std::string& str, const std::string& checkStr); static bool Contains(const std::string& str, const std::string& subStr); static void Erase(std::string& source, const std::string& subStr); + static void StripSurrounding(std::string& source, const char c); + + // Split the source into two strings at the first character 'c' after position 'startAt'. The 'c' character is + // not included in the returned strings. Text before the 'startAt' will not be included. + static std::array SplitAt(const std::string& source, const char c, const size_t startAt = 0); }; diff --git a/tests/tests.git.h b/tests/tests.git.h index 14b5fcb4..be8b8a72 100644 --- a/tests/tests.git.h +++ b/tests/tests.git.h @@ -17,7 +17,7 @@ int TestGitAPI() TEST(git.InitializeRepository("/tmp/test-repo"), true); git.CreateIndex(); - git.AddFileToIndex("//a/b/c/...", "//a/b/c/foo.txt", { 'x', 'y', 'z' }, false); + git.AddFileToIndex("foo.txt", { 'x', 'y', 'z' }, false); git.Commit( "//a/b/c/...", "12345678", @@ -25,14 +25,15 @@ int TestGitAPI() "test@user", 0, "Test description", - 10000000); + 10000000, + ""); TEST(git.IsHEADExists(), true); TEST(git.IsRepositoryClonedFrom("//a/b/c/..."), true); TEST(git.IsRepositoryClonedFrom("//a/b/c/d/..."), false); TEST(git.IsRepositoryClonedFrom("//x/y/z/..."), false); TEST(git.DetectLatestCL(), "12345678"); - git.RemoveFileFromIndex("//a/b/c/...", "//a/b/c/foo.txt"); + git.RemoveFileFromIndex("foo.txt"); git.Commit( "//a/b/c/...", "12345679", @@ 
-40,7 +41,8 @@ int TestGitAPI() "test2@user", 0, "Test description", - 20000000); + 20000000, + ""); TEST(git.IsHEADExists(), true); TEST(git.IsRepositoryClonedFrom("//a/b/c/..."), true); TEST(git.IsRepositoryClonedFrom("//a/b/c/d/..."), false); diff --git a/tests/tests.utils.h b/tests/tests.utils.h index 91606304..b43aedb6 100644 --- a/tests/tests.utils.h +++ b/tests/tests.utils.h @@ -6,6 +6,7 @@ */ #pragma once +#include #include "tests.common.h" #include "utils/std_helpers.h" #include "utils/time_helpers.h" @@ -82,6 +83,17 @@ int TestUtils() TEST(Time::GetTimezoneMinutes("2022/03/09 22:59:04 +0000 GMT"), +0); TEST(Time::GetTimezoneMinutes("2022/03/09 22:59:04 -0000 GMT"), +0); + { + auto actual = STDHelpers::SplitAt("depot/path/some/other", '/'); + auto expected = std::array { "depot", "path/some/other" }; + TEST(actual, expected); + } + { + auto actual = STDHelpers::SplitAt("//depot/path", '/', 2); + auto expected = std::array { "depot", "path" }; + TEST(actual, expected); + } + TEST_END(); return TEST_EXIT_CODE(); } diff --git a/validate-migration.sh b/validate-migration.sh new file mode 100755 index 00000000..e0709b30 --- /dev/null +++ b/validate-migration.sh @@ -0,0 +1,241 @@ +#!/usr/bin/env bash + +migration_log_file="" +p4_client_dir="" +bare_git_dir="" +default_workdir="/tmp/$( basename "$0" )-data" +workdir="${default_workdir}" + +# scan sub-directories; exclude the '.git' directory. +diff_args=("--recursive" "--exclude=.git") +show_help=0 +forced=0 +debug=0 + +for arg in "$@" ; do + case "${arg}" in + --force) + forced=1 + ;; + --logfile=*) + migration_log_file="${arg:10}" + ;; + --p4workdir=*) + p4_client_dir="${arg:12}" + ;; + --gitdir=*) + bare_git_dir="${arg:9}" + ;; + --datadir=*) + workdir="${arg:10}" + ;; + --ignore-eol) + diff_args+=("--ignore-trailing-space") + ;; + --debug) + debug=1 + ;; + --help) + show_help=1 + ;; + -h) + show_help=1 + ;; + *) + echo "Unknown argument: ${arg}" + show_help=1 + ;; + esac +done +if [ ! 
-f "${migration_log_file}" ] || [ -z "${p4_client_dir}" ] || [ ! -d "${bare_git_dir}" ] || [ ${forced} = 0 ] ; then + show_help=1 +fi + +if [ ${debug} = 1 ] ; then + echo "logfile=${migration_log_file}" + echo "p4_client_dir=${p4_client_dir}" + echo "bare_git_dir=${bare_git_dir}" + echo "workdir=${workdir}" + echo "diff_args=${diff_args[*]}" +fi + +if [ ${show_help} = 1 ] ; then + echo "Usage: $( basename "$0" ) (arguments)" + echo "where:" + echo " --force" + echo " Force operation. The tool will not run without this." + echo " Required." + echo " --logfile=(migration log file)" + echo " The location of the captured output from running the p4-fusion command." + echo " This must include a complete listing of all commits." + echo " Make sure this isn't located in the data directory." + echo " Required." + echo " --p4workdir=(p4 workspace directory)" + echo " Location of the local directory mapped to the Perforce workspace" + echo " for the migrated depot '--path' argument." + echo " Required." + echo " --gitdir=(generated git directory)" + echo " The generated, bare Git directory from the '--src' argument." + echo " This directory will not be changed." + echo " Required." + echo " --datadir=(working directory)" + echo " Location where data files and a cloned Git repository are created." + echo " Defaults to ${default_workdir}" + echo " The contents of this directory will be wiped! Be careful reusing a directory." + echo " --ignore-eol" + echo " When performing the difference, ignore the whitespace at the end of lines." + echo " --debug" + echo " Report progress information." + echo " --help" + echo " This text." + echo "" + echo "Additionally, you must have your Perforce environment configured such that" + echo "running 'p4' will be able to sync files to the workspace directory." + echo "" + echo "WARNING:" + echo "Note that this tool will modify the contents of the workspace directory," + echo "including removing all files from it. 
It will also run 'p4 sync' in the" + echo "directory, causing your client's file references to change." + echo "" + echo "It also cleans out the contents of the data directory." + exit 0 +fi + + +# Ensure the Perforce client and its workspace directory are clean before we start. +if [ ${debug} = 1 ]; then + echo "Cleaning out Perforce client and directory under ${p4_client_dir}" +fi +test -d "${p4_client_dir}" || mkdir -p "${p4_client_dir}" +( cd "${p4_client_dir}" && p4 sync -f -q "...#0" ) || exit 1 +rm -rf "${p4_client_dir}" || exit 1 +mkdir -p "${p4_client_dir}" + +# Set up our work directory. +test -d "${workdir}" && rm -rf "${workdir}" +mkdir -p "${workdir}" +mkdir "${workdir}/diffs" + +# Save off a copy of the migration log. +cp "${migration_log_file}" "${workdir}/migration-log.txt" + +# Extract out the commit/changelist history. +# Each extracted line is in the format "(commit sha):(p4 changelist):(branch name)" +grep -E '] COMMIT:' "${workdir}/migration-log.txt" \ + | cut -f 2- -d ']' \ + | cut -f 2- -d ':' \ + > "${workdir}/history.txt" + +# Make a full file copy of the bare Git repository. +if [ ${debug} = 1 ]; then + echo "Copying Git repository into ${workdir}/repo" +fi +( cd "${workdir}" && git clone "${bare_git_dir}" repo ) + +# Process every commit. +while read line ; do + # The Git commit SHA. + gitcommit="$( echo "${line}" | cut -f 1 -d ':' )" + + # Shorten up the SHA; some Perforce changelists may map to multiple + # commits, so we need this as a distinguisher. + gitcommit_short="${gitcommit:0:5}" + + # Perforce changelist + p4cl="$( echo "${line}" | cut -f 2 -d ':' )" + + # Branch the commit happened in, and the source Perforce branch. + # Empty if there was no branch specified when running p4-fusion. + branch="$( echo "${line}" | cut -f 3 -d ':' )" + + # The relative P4 depot path for pulling files in the changelist. + p4DepotPath="${branch}/...@${p4cl}" + + # The output directory where the perforce files are placed. 
+ # Allows for clean diff between the git repo and the Perforce files. + p4dir="${p4_client_dir}/${branch}" + + # The name of the Git branch to create. Again, because the changelist + # may have multiple commits, we need to make them distinctly named. + gitbranch="${branch}-${p4cl}-${gitcommit_short}" + + # The output differences file. The changelist is first in the bits of the + # name, to allow later comparisons to be easier. + diff_file="${workdir}/diffs/diff-${p4cl}-${branch}-${gitcommit_short}.txt" + + if [ -z "${branch}" ] ; then + # No branching done for the p4-fusion execution, so strip branch + # specific parts off of the names. + p4DepotPath="...@${p4cl}" + p4dir="${p4_client_dir}" + gitbranch="br-${p4cl}-${gitcommit_short}" + diff_file="${workdir}/diffs/diff-${p4cl}.txt" + fi + + echo "${p4cl} - ${gitcommit}" >> "${workdir}/progress.txt" + + # Fetch all the files. + # The Git checkout and Perforce sync are performed in parallel. + + # Make a clean checkout in Git of the commit. + # This ensures the files are exactly what's in the commit. + if [ ${debug} = 1 ]; then + echo "Switching to Git Commit ${gitcommit}" + fi + ( cd "${workdir}/repo" && git checkout -b "${gitbranch}" "${gitcommit}" >/dev/null 2>&1 && git reset --hard >/dev/null 2>&1 && git clean -f -d >/dev/null 2>&1 ) & + j1=$! + + # Have the Perforce depot path match that specific changelist state. + # Because the directory was cleaned out before the start, it should be in a pristine state + # after running. + # Note: not sync -f, because that's not necessary. + if [ ${debug} = 1 ]; then + echo "Fetching ${p4DepotPath}" + fi + ( cd "${p4_client_dir}" && p4 sync -q "${p4DepotPath}" ) & + j2=$! + + # Wait for the checkout and sync to complete. + wait $j1 $j2 + + # Discover differences. 
+ if [ ${debug} = 1 ]; then + echo "Writing diff into ${diff_file}" + fi + diff "${diff_args[@]}" "${workdir}/repo" "${p4dir}" > "${diff_file}" + if [ -s "${diff_file}" ] ; then + echo "${p4cl}:${gitcommit}:${diff_file}" >> "${workdir}/commit-differences.txt" + echo "${p4cl}" >> "${workdir}/all-changelist-differences.txt" + fi +done < "${workdir}/history.txt" + +# For the error detection, only loop through unique changelists. +sort < "${workdir}/all-changelist-differences.txt" | uniq > "${workdir}/changelist-differences.txt" + +error_count=0 + +if [ ${debug} = 1 ]; then + echo "Discovering problems." +fi +while read p4cl ; do + # This changelist had at least 1 corresponding commit with a problem. + # If there is some commit with the same changelist with no problem, + # then that means it eventually matched. + # Note that, with the splat pattern, non-branch runs will always have + # this changelist be marked as a problem. + is_error=1 + file_list=() + for diff in "${workdir}/diffs/diff-${p4cl}-"*.txt ; do + file_list+=("$( basename "${diff}" )") + if [ ! -s "${diff}" ] ; then + is_error=0 + fi + done + if [ ${is_error} = 1 ] ; then + error_count=$(( error_count + 1 )) + echo "${p4cl} ${file_list[*]}" >> "${workdir}/errors.txt" + echo "ERROR: changelist ${p4cl}" + fi +done < "${workdir}/changelist-differences.txt" +echo "${error_count} problems discovered. Complete list in ${workdir}/errors.txt and ${workdir}/commit-differences.txt" +exit ${error_count}
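
The validation script depends on the `COMMIT:<sha>:<changelist>:<branch>:` lines that the commit loop prints "for scripting/testing purposes". The following is a minimal sketch of that extraction step in isolation, not part of the patch itself. The function name `extract_history` and the `[ hypothetical-prefix ]` log prefix are assumptions made for illustration; the real prefix written by the `PRINT` macro may differ, and the pipeline only requires that a `]` immediately precedes ` COMMIT:`.

```shell
#!/usr/bin/env bash
# Sketch: reduce captured p4-fusion output to "sha:changelist:branch:" records,
# using the same grep/cut pipeline as validate-migration.sh.
# NOTE: extract_history and the log prefix below are hypothetical names.

extract_history() {
    # Keep only COMMIT lines, drop everything up to the first ']',
    # then drop the leading " COMMIT" field.
    grep -E '] COMMIT:' "$1" | cut -f 2- -d ']' | cut -f 2- -d ':'
}

sample_log="$(mktemp)"
cat > "${sample_log}" <<'EOF'
[ hypothetical-prefix ] COMMIT:0123abcd:1001:main:
[ hypothetical-prefix ] CL 1001 --> Commit 0123abcd with 3 files to branch main.
[ hypothetical-prefix ] COMMIT:89efcdab:1002::
EOF

# Split each record on ':'; an empty branch field means a non-branching run.
extract_history "${sample_log}" | while IFS=':' read -r sha cl branch _ ; do
    echo "sha=${sha} cl=${cl} branch=${branch:-<none>}"
done
rm -f "${sample_log}"
```

Because the first `cut` splits on `]`, any `:` characters inside the log prefix do not disturb the field positions that the later `cut -f 1`, `-f 2`, and `-f 3` calls rely on.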