Porting a source project into PID
This tutorial proposes general guidelines for porting existing source code into PID. There are two possible starting points:
- you have some source code with no clean project structure.
- you already have an existing source project. The port of the yaml-cpp project is used as an example.
The following parts of this tutorial refer to those two scenarios because the actions to perform may vary depending on the situation.
Step 1: Identify components
The first thing to do is to analyze the source code in order to extract the components that will be generated by the package:
- For raw source code: there is no explicit definition of components, so you have to find them directly in the source. The basic idea is to separate the source code that can be reused from the other parts of the code. Source code providing reusable functionalities should be identified as a library; the rest of the code forms applications.
- For structured existing projects: as there is already a description of the components, the best thing to do first is to reuse this structure. If a CMake description of the project is provided, it is simple to identify libraries (defined with the add_library command) and applications (defined with add_executable) and their related source code.
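For a CMake based project, a first inventory of components can be obtained by searching the CMakeLists.txt files. The sketch below is only an illustration: it builds a throwaway project tree (the file contents and the list_targets helper are hypothetical) and prints the target names declared with add_library and add_executable.

```shell
#!/bin/sh
# Throwaway project tree standing in for a real project (contents are made up).
project=$(mktemp -d)
mkdir -p "$project/util"
cat > "$project/CMakeLists.txt" <<'EOF'
add_library(yaml-cpp ${sources})
add_subdirectory(util)
EOF
cat > "$project/util/CMakeLists.txt" <<'EOF'
add_executable(parse parse.cpp)
add_executable(read read.cpp)
EOF

# Hypothetical helper: print the first argument of every call to a CMake command.
# $1: CMake command name, $2: project root
list_targets() {
  find "$2" -name CMakeLists.txt -exec \
    sed -n "s/^[[:space:]]*$1[[:space:]]*([[:space:]]*\([A-Za-z0-9_.-]*\).*/\1/p" {} + | sort
}

libraries=$(list_targets add_library "$project")
applications=$(list_targets add_executable "$project")

echo "libraries:"
echo "$libraries"
echo "applications:"
echo "$applications"
```

This only catches targets whose name is written literally on the same line as the command. Projects that wrap these commands in macros or use variables as target names still need a manual read of their CMake files.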
Step 2: Identify dependencies
Once you have identified all components, you have to identify their dependencies in order to define the package dependencies.
- For raw source code: the most common way to find those dependencies is to look at the headers included in the source code. This way you can first define relations between your libraries and applications, and then identify what does not belong to your project. For instance, if you find an include directive like #include <boost/filesystem.hpp>, it means that the corresponding component needs the boost filesystem library. So you can deduce that the new package will depend on the boost external package.
- For structured existing projects: dependencies can be more or less explicit. In CMake based projects you should look at find_package (and other commands like find_library, find_path, etc.) because they define package dependencies. Dependencies of components can also be found by looking at target_link_libraries commands (and others like target_compile_options, target_compile_definitions, etc.). Finding the relation between the target in use (corresponding to a library, e.g. Boost filesystem) and the package it belongs to (e.g. Boost) should not be a problem in most cases.
Once done, you know the dependencies of the package you will create as well as the components it will generate.
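For raw source code, the header inspection described above can be partly automated. The following sketch assumes a throwaway source tree (the file names reader.cpp and writer.cpp are hypothetical) and lists every header included with angle brackets, without duplicates.

```shell
#!/bin/sh
# Throwaway source tree standing in for raw source code (contents are made up).
src=$(mktemp -d)
cat > "$src/reader.cpp" <<'EOF'
#include <boost/filesystem.hpp>
#include <string>
#include "reader.h"
EOF
cat > "$src/writer.cpp" <<'EOF'
#include <boost/filesystem.hpp>
#include <vector>
EOF

# Collect headers included with angle brackets, one per line, sorted and deduplicated.
system_includes=$(find "$src" -type f \( -name '*.cpp' -o -name '*.h' -o -name '*.hpp' \) \
  -exec sed -n 's/^[[:space:]]*#[[:space:]]*include[[:space:]]*<\([^>]*\)>.*/\1/p' {} + | sort -u)

echo "$system_includes"
```

Headers such as string or vector belong to the C++ standard library and can be ignored, while a prefix like boost/ hints at an external package. Headers included with quotes are skipped here on the assumption that they are local to the project, which is a convention rather than a rule.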
Step 3: Create the PID package
Now that you have identified what to do, you need to create the new package. We call it newpack.
3.1: create the package in your local workspace
cd <pid-workspace>
pid create package=newpack
cd <pid-workspace>/packages/newpack/build
- For structured existing projects that already have a source repository: it can be interesting to preserve the existing history of revisions instead of simply copying the source code into newpack. We suppose newpack is a port of the yaml-cpp project, whose repository can be found at https://github.com/jbeder/yaml-cpp.git.
To do this:
- configure a remote of the package to make it point to the existing project URL.
cd <pid-workspace>/packages/newpack
git remote add existing https://github.com/jbeder/yaml-cpp.git
- update the package content with the content of the existing project (then optionally remove the connection between both)
git pull existing master # use the appropriate branch; Git 2.9 or later may need --allow-unrelated-histories
git remote remove existing # optional
The pull operation generates a lot of noise (including many conflicts) in newpack, but it also guarantees that the history of commits is preserved, so new commits will simply be added on top of the original project's ones.
The first thing to do is to resolve the conflicts. Here are the general rules:
- README.md of the original project may be left empty because it will be regenerated by PID in the end.
- conflicting .gitignore files should be left as they were generated by PID (use the HEAD alternative when resolving conflicts).
- the conflicting CMakeLists.txt file should contain the merged content of both versions: the PID description is left as is, while the original project description is put into comments (to keep it available). The latter will be removed once the package description is finished.
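The rule for .gitignore can be tried safely outside of a real workspace. The sketch below is an illustration under assumptions: it creates two throwaway repositories with unrelated histories ("upstream" plays the original project and "newpack" plays the PID package, both with made-up file contents), merges one into the other, then keeps the HEAD side of the conflicting file with git checkout --ours.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)

# "upstream" stands in for the original project repository.
git init -q "$demo/upstream"
cd "$demo/upstream"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "rules from the original project" > .gitignore
git add .gitignore
git commit -q -m "original project"
branch=$(git rev-parse --abbrev-ref HEAD)

# "newpack" stands in for the freshly created PID package.
git init -q "$demo/newpack"
cd "$demo/newpack"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "rules generated by PID" > .gitignore
git add .gitignore
git commit -q -m "package created"

# Pull the original history; the conflict makes the pull fail, so its status is ignored.
git remote add existing "$demo/upstream"
git pull -q --no-rebase --allow-unrelated-histories existing "$branch" >/dev/null 2>&1 || true

# Keep the HEAD (PID) side of the conflicting file, then conclude the merge.
git checkout --ours -- .gitignore
git add .gitignore
git commit -q -m "after merging original project into PID package"

resolved=$(cat .gitignore)
echo "$resolved"
```

The same two commands (git checkout --ours followed by git add) apply to any file for which the PID version must win. CMakeLists.txt needs manual editing instead, since both sides have to be kept.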
Another important aspect is to set the version adequately.
- For raw source code: simply keep the version as is (version should be 0.1.0).
- For structured existing projects: we recommend starting from the next version of the original project. For instance, at the time this tutorial was written the yaml-cpp project was at version 0.6.2, so we recommend setting the version directly to 1.0.0. This way your code explicitly continues the project history, but with a version that is no longer compatible with existing ones (the major digit of the version number changes).
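As a small worked example of this rule, the next major version can be derived from the original version number. This is only a sketch of the arithmetic; the variable names are arbitrary.

```shell
#!/bin/sh
# Version of the original project at the time of the port.
original_version="0.6.2"

# Keep only the major digit, increment it, and reset minor and patch to 0.
major=${original_version%%.*}
ported_version="$((major + 1)).0.0"

echo "$ported_version"
```

The resulting value is the one to give to the VERSION argument of the package description.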
3.2: create a repository newpack in your git hosting solution
Then before starting to work on the package, save the history:
git add --all
git commit -m "after merging original project into PID package"
Now create an online repository newpack in your preferred git hosting solution, copy its address, then do:
cd <pid-workspace>
pid connect package=newpack url=<online url on newpack>
Your package is now connected. Simply push the content to the online repository in order to save your work:
cd <pid-workspace>/packages/newpack/build
git push origin integration
Step 4: Organizing the package
Now you have to (re)structure content of the package to make it follow PID package description rules:
4.1: managing libraries
1) For each library you identified create a folder whose name is the name of the library:
- in your package's src and include folders if the component is a static or shared library.
- only in your package's include folder if the component is a header library.
- only in your package's src folder if the component is a module library (it is a shared object without explicit headers, intended to be loaded at runtime only, that implements a predefined protocol).
For instance yaml-cpp defines one library, also named yaml-cpp. The original yaml-cpp project can build this library either as a static library (default) or as a shared library. We decide to build it as a static library only for now, so we create a folder called yaml-cpp in both the include and src folders of the package.
2) For each library put the adequate source code into its corresponding folder.
In many existing projects, public headers are already separated from internal ones because they are put into an include folder. This is the case for the yaml-cpp project, so we simply move all elements from the include folder to the include/yaml-cpp folder (i.e. into the folder we just created). The include folder of the original project already contains a folder named yaml-cpp, so that folder simply ends up inside a folder with the same name (include/yaml-cpp/yaml-cpp).
The source code of the yaml-cpp library is directly contained in src, so we simply move it into a new src/yaml-cpp folder.
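The folder moves described above can be rehearsed on a throwaway tree before touching the real package. The sketch below assumes a simplified layout (one header, one source file, and the src/CMakeLists.txt that belongs to the PID package; the file names are placeholders) and uses plain mv rather than any PID command.

```shell
#!/bin/sh
set -e
# Throwaway tree mimicking the package right after the pull (placeholder files).
pkg=$(mktemp -d)
mkdir -p "$pkg/include/yaml-cpp" "$pkg/src"
touch "$pkg/include/yaml-cpp/yaml.h" "$pkg/src/parser.cpp" "$pkg/src/CMakeLists.txt"

# Headers: wrap the whole content of include into a new include/yaml-cpp folder.
mkdir "$pkg/include.tmp"
mv "$pkg"/include/* "$pkg/include.tmp/"
mkdir "$pkg/include/yaml-cpp"
mv "$pkg"/include.tmp/* "$pkg/include/yaml-cpp/"
rmdir "$pkg/include.tmp"

# Sources: move everything except src/CMakeLists.txt into src/yaml-cpp.
mkdir "$pkg/src/yaml-cpp"
for entry in "$pkg"/src/*; do
  case "$(basename "$entry")" in
    CMakeLists.txt|yaml-cpp) ;;
    *) mv "$entry" "$pkg/src/yaml-cpp/" ;;
  esac
done

# Show the resulting layout, relative to the package root.
find "$pkg" -type f | sed "s|^$pkg/||" | sort
```

With this layout, include/yaml-cpp acts as the root of the public headers of the component, so an include path starting with yaml-cpp/ keeps working for the header moved into include/yaml-cpp/yaml-cpp.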
3) For each library, in src/CMakeLists.txt:
- declare the library and make it target the corresponding folder. For instance in yaml-cpp we identified only one static library:
PID_Component(STATIC NAME yaml-cpp DIRECTORY yaml-cpp)
- If your libraries depend on one another, directly set those dependencies using PID_Component_Dependency.
yaml-cpp has no such dependencies, so there is no need to bother with that.
- For structured existing projects with an explicit description (e.g. with CMake), look at that description to find out whether libraries need specific flags.
In the original project CMakeLists.txt we see:
# Set minimum C++ to 2011 standards
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
This means that yaml-cpp is a C++11 library so we need to set the corresponding property:
PID_Component(STATIC NAME yaml-cpp DIRECTORY yaml-cpp
CXX_STANDARD 11)
4.2: managing applications
1) For each executable you identified create a folder whose name is the name of the executable:
- in your package's test folder if the component implements unit tests for some of your libraries.
- in your package's apps folder otherwise.
In the original yaml-cpp project we identified 3 applications (called parse, read and sandbox) and 1 unit test (called run-tests), so we created the following folders:
- parse, read and sandbox are created in the apps folder.
- run_tests is created in the test folder.
2) For each executable put the adequate source code into its corresponding folder.
For example, the source code for parse is really simple to find since it is directly specified in one of the CMakeLists.txt files of the original project. The code is a single file named parse.cpp. We simply move this source file into the parse folder. We repeat the operation for the other applications.
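Repeating the operation for every application is easy to script. The sketch below works on a throwaway tree and assumes that each application consists of a single source file named after it and stored in a util folder; only parse.cpp is confirmed by the text above, the other two file names are assumptions.

```shell
#!/bin/sh
set -e
# Throwaway tree: one source file per application (read.cpp and sandbox.cpp are assumed names).
pkg=$(mktemp -d)
mkdir -p "$pkg/util" "$pkg/apps"
touch "$pkg/util/parse.cpp" "$pkg/util/read.cpp" "$pkg/util/sandbox.cpp"

# Create one folder per application in apps and move its source file there.
for app in parse read sandbox; do
  mkdir -p "$pkg/apps/$app"
  mv "$pkg/util/$app.cpp" "$pkg/apps/$app/"
done

# Show the resulting layout, relative to the package root.
find "$pkg/apps" -type f | sed "s|^$pkg/||" | sort
```

An application made of several files would need its own list of files instead of the single-file naming assumption used in the loop.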
Organizing the unit test is a bit more complicated because its code is more complex, but in the end it comes down to moving all the code of the test folder into the test/run_tests folder, except the CMakeLists.txt file. The latter still contains the original project configuration code, which will be useful in the following steps.
3) For each unit test, in test/CMakeLists.txt:
PID_Component(TEST NAME run-tests DIRECTORY run_tests
DEPEND yaml-cpp)
We can see from the original project that the unit test requires the googletest package. It means that googletest must be defined as a dependency of the package and that run-tests depends on the gmock component provided by this dependency:
PID_Component(TEST NAME run-tests DIRECTORY run_tests
DEPEND yaml-cpp googletest/gmock)
run_PID_test(NAME yaml-test COMPONENT run-tests)
The run_PID_test command defines a test case yaml-test that simply executes the run-tests executable.
4) For each application, in apps/CMakeLists.txt:
- if this application is a useful tool or program for end users:
PID_Component(APPLICATION NAME read DIRECTORY read
DEPEND yaml-cpp)
- if this application is just an example on how to use the libraries:
PID_Component(EXAMPLE NAME parse DIRECTORY parse
DEPEND yaml-cpp)
PID_Component(EXAMPLE NAME sandbox DIRECTORY sandbox
DEPEND yaml-cpp)
Notice that all these applications depend on the yaml-cpp library.
Step 5: Clean unused remaining files
For structured existing projects it may be useful to clean up the package a bit after restructuring its content. For instance the yaml-cpp project contains many files that are no longer useful: .travis.yml, install.txt, LICENSE, etc.
Also, after the restructuring, some folders of the original project may no longer contain important data. After our restructuring of yaml-cpp there is a folder named util (it contained the source code of the executables) that is no longer useful, so we can delete it.
Step 6: Write global package description
Writing the global description is mostly done. The only thing we really have to take care of now is the dependencies of the package newpack.
For the yaml-cpp example, the description should look like:
cmake_minimum_required(VERSION 3.15.7)
set(WORKSPACE_DIR ${CMAKE_SOURCE_DIR}/../.. CACHE PATH "root of the packages workspace directory")
list(APPEND CMAKE_MODULE_PATH ${WORKSPACE_DIR}/cmake) # using generic scripts/modules of the workspace
include(Package_Definition NO_POLICY_SCOPE)
project(newpack)
PID_Package(
    AUTHOR Your name
    YEAR 2019
    LICENSE MIT
    DESCRIPTION "A YAML parser and emitter in C++"
    VERSION 1.0.0
)
PID_Author(AUTHOR Jesse Beder)
if(BUILD_AND_RUN_TESTS) # add the dependency to googletest only if tests are run
    PID_Dependency(googletest)
endif()
build_PID_Package()
As you may notice, we defined a dependency on the googletest package, and this dependency is used only when the package is tested. Since googletest is not known in PID yet, you would have to either:
- port this external project as a native package, the same way we are doing for yaml-cpp.
- create a wrapper for this external package.
- define a platform configuration that checks if the system has googletest installed.
One simple solution for this tutorial is to avoid building tests!
Also it is a good idea to reproduce important meta-data coming from the original project:
- it is good practice to reuse the license chosen by the original project authors (e.g. MIT is the license used for yaml-cpp).
- it is very important to report the names of the original authors using the PID_Author command. This way their names will be reported in all parts of the documentation. Set the AUTHOR and INSTITUTION arguments of PID_Package with your personal data so that you are explicitly the person to contact.
Now let's try a build. Everything should work as expected:
cd <pid-workspace>/packages/newpack
pid build
That's it, there is not much more to learn!