Commit

Merge remote-tracking branch 'origin/dev_dc1_fleg' into dev_dc1_fleg
LEGRAND Francois committed Mar 3, 2023
2 parents a1054d1 + 1346e55 commit f7d6fa0
Showing 7 changed files with 157 additions and 72 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/tests_with_docker.yml
@@ -1,9 +1,9 @@
name: Tests with docker
on:
push:
paths-ignore:
- docs/requirements.txt
- examples/**
paths:
- grand/**
- tests/**

jobs:
Linux:
14 changes: 7 additions & 7 deletions examples/datalib/config.ini
@@ -26,15 +26,15 @@ CC = ["ssh","cca.in2p3.fr",22,["/sps/trend/pengxiong/GP81_interpolation/GP81_100
WEB = [ "https", "github.com" , 443, ["/grand-mother/data_challenge1/raw/main/coarse_subei_traces_root/"]]

; Credentials for repositories given as :
; Name = [user, password, keyfile, keypasswd]
; Name = [user, keyfile]
; where name is the name of the repository
; For security reasons it is highly recommended NOT to provide any sensitive information as password keyfile or keypasswd
; in this file. For example, if protocol is ssh, it's better to use an ssh-agent
; To run an ssh-agent just do : eval $(ssh-agent) and ssh-add .ssh/id_rsa
; To export your ssh agent from host to docker simply add an environment variable SSH_AUTH_SOCK=/ssh-agent to your docker
; and mount the volume ${SSH_AUTH_SOCK}:/ssh-agent
; This section lets you specify your login and, optionally, a key file used to access repositories or to connect to the database through an ssh tunnel.
; For security reasons, sensitive information such as passwords cannot be provided in this file.
; If a password is required (e.g. to decrypt the key file), it will be requested interactively.
; For the ssh protocol, using an ssh-agent is strongly encouraged (it avoids typing the password interactively at each run).
; To run an ssh-agent, just do: `eval $(ssh-agent)` and `ssh-add .ssh/id_rsa`
[credentials]
CC = ["legrand","","",""]
CC = ["legrand",""]

; database to use (only one database can be defined)
; Name = [server, port, database, login, passwd, sshtunnel_server, sshtunnel_port, sshtunnel_credentials ]
13 changes: 6 additions & 7 deletions granddb/config.ini
@@ -8,9 +8,9 @@ socket_timeout = 5
;If data is not found in local directories but found in a repository,
; it will be copied in the first localdir of the list (e.g. .incoming).
; At least one localdir (incoming) is needed.
; incoming directory must be an absolute path
; incoming directory must exist, be an absolute path and be writable
[directories]
localdir = ["/home/fleg/DEV/GRAND/incoming", "/home/fleg/DEV/GRAND/grand/granddb/incoming/"]
localdir = ["/home/fleg/DEV/GRAND/incoming", "/home/fleg/GRAND/"]

; remote repositories to search for data if not present in local directories
; repositories are given as list :
@@ -34,15 +34,14 @@ WEB = [ "https", "github.com" , 443, ["/grand-mother/data_challenge1/raw/main/co
; To export your ssh agent from host to docker simply add an environment variable SSH_AUTH_SOCK=/ssh-agent to your docker
; and mount the volume ${SSH_AUTH_SOCK}:/ssh-agent
[credentials]
CC = ["legrand","","",""]
CCIN2P3 = ["legrand","","",""]
SSHTUNNEL = ["fleg","","",""]
CC = ["legrand",""]
CCIN2P3 = ["legrand",""]
SSHTUNNEL = ["fleg",""]

; database to use (only one database can be defined)
; Name = [server, port, database, login, passwd, sshtunnel_server, sshtunnel_port, sshtunnel_credentials ]
[database]
#database = ["lpndocker01.in2p3.fr", "" ,"granddb", "postgres", "password","", 22, ""]
database = ["lpndocker01.in2p3.fr", "" ,"granddb", "postgres", "password","lpnclaude.in2p3.fr", 22, "SSHTUNNEL"]
database = ["localhost", "" ,"granddb", "postgres", "password","sshtun.in2p3.fr", 22, "SSHTUNNEL"]

; The following section is optional.
; it defines the repository where registered files need to go.
124 changes: 90 additions & 34 deletions granddb/granddatalib.py
@@ -1,4 +1,3 @@
#import logging
from pathlib import Path
import scp
import paramiko
@@ -11,7 +10,7 @@
import socket
import grand.manage_log as mlg
import copy

import getpass

# specific logger definition for script because __name__ is "__main__" !
logger = mlg.get_logger_for_script(__name__)
@@ -38,7 +37,7 @@
# [repositories]
# Repo1 = ["protocol","server",port,["/directory1/", "/other/dir/"]]
# [credentials]
# Repo1 = ["user","password","keyfile","keypasswd"]
# Repo1 = ["user","keyfile"]
# [database]
# localdb = ["host", port, "dbname", "user", "password", "sshtunnel_server", sshtunnel_port, "sshtunnel_credentials" ]]
# [registerer]
@@ -79,7 +78,7 @@ def __init__(self, file="config.ini"):
if configur.has_section('credentials'):
for name in configur['credentials']:
cred = json.loads(configur.get('credentials', name))
self._credentials[name] = Credentials(name, cred[0], cred[1], cred[2], cred[3])
self._credentials[name] = Credentials(name, cred[0], cred[1])

if configur.has_section('directories'):
# Get localdirs (the first in the list is the incoming)
@@ -90,6 +89,11 @@ def __init__(self, file="config.ini"):
# Add trailing slash if needed
dirlist = [os.path.join(path, "") for path in dirlist]
self._incoming = dirlist[0]
# Check that incoming directory exists
my_path = Path(self._incoming)
if not my_path.exists():
logger.error(f"Incoming directory {self._incoming} does not exist.")
exit(1)
self._directories.append(Datasource("localdir", "local", "localhost", "", dirlist, self.incoming()))
# We also append localdirs to repositories... so search method will first look at local dirs before searching on remote locations
# self._repositories.append(Datasource("localdir", "local", "localhost", "", dirlist, self.incoming()))
@@ -236,12 +240,12 @@ class Credentials:
_keyfile: str
_keypasswd: str

def __init__(self, name, user, password, keyfile, keypasswd):
def __init__(self, name, user, keyfile):
self._name = name
self._user = user
self._password = password
self._password = ""
self._keyfile = keyfile
self._keypasswd = keypasswd
self._keypasswd = ""

def name(self):
return self._name
@@ -283,7 +287,7 @@ def __init__(self, name, protocol, server, port, paths, incoming, id_repo=None):
self._port = port
self._paths = [os.path.join(path, "") for path in paths]
# By default no credentials
self._credentials = Credentials(name, "", "", "", "")
self._credentials = Credentials(name, "", "")
self._incoming = incoming
self.id_repository = id_repo
if protocol == 'ssh':
@@ -351,22 +355,44 @@ class DatasourceLocal(Datasource):
def get(self, file, path=None):
# TODO : Check that path is in self.paths(), if not then copy in incoming ?
found_file = None
# Path is given : we only search in that path
if not (path is None):
my_path = Path(path)
if not my_path.exists():
logger.warning(f"path {path} not found (seems not exists) ! Check that path is mounted if you run in docker !")
my_file = Path(path + file)
if my_file.is_file():
found_file = path + file
else:
logger.warning(f"path {path} not found (it does not seem to exist)! Check that it is mounted if you run in docker!")

my_file = None
liste = list(Path(path).rglob(file))
for my_file in liste:
if my_file.is_file():
found_file = my_file
break

if my_file is None:
logger.debug(f"file {file} not found in localdir {path}")
#my_file = Path(path + file)

#if my_file.is_file():
# found_file = path + file
#else:
# logger.debug(f"file {file} not found in localdir {path}")
else:
# No path given : we recursively search in all dirs and subdirs
for path in self.paths():
logger.debug(f"search in localdir {path}{file}")
my_file = Path(path + file)
if my_file.is_file():
found_file = path + file

#my_file = Path(path + file)
my_file = None
liste = list(Path(path).rglob(file))
for my_file in liste:
if my_file.is_file():
found_file = my_file
break
if not my_file is None and my_file.is_file():
break
#if my_file.is_file():
# found_file = path + file
# break
else:
logger.debug(f"file {file} not found in localdir {path}")

@@ -389,32 +415,58 @@ def copy(self, pathfile):
# @author Fleg
# @date Sept 2022
class DatasourceSsh(Datasource):

def set_client(self):
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(hostname=self.server(),
# Set up the ssh connection. If recurse=True (default), a failed attempt triggers a password prompt and a second attempt is made.
# If recurse=False (the second attempt), a failure logs an error message and returns None (which makes the search skip this repository).
def set_client(self, recurse=True):
try:
client = paramiko.SSHClient()
if self.credentials().keyfile() != "" and recurse is True:
self.credentials()._keypasswd = getpass.getpass(prompt="Please give password to decrypt keyfile " + self.credentials().keyfile() + ": ")
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(hostname=self.server(),
port=self.port() if self.port() != "" else None,
username=self.credentials().user() if self.credentials().user() != "" else None,
password=self.credentials().password() if self.credentials().password() != "" else None,
key_filename=self.credentials().keyfile() if self.credentials().keyfile() != "" else None,
passphrase=self.credentials().keypasswd() if self.credentials().keypasswd() != "" else None)
except paramiko.AuthenticationException as e:
if recurse :
self.credentials()._password = getpass.getpass(prompt="Please give password for user " + self.credentials().user() + " @ " + self.server() + ":")
client = self.set_client(False)
else:
logger.error(f"Authentication error {e} during connection to {self.server()}")
client = None
except paramiko.SSHException as e:
if recurse :
self.credentials()._password = getpass.getpass(prompt="Please give password for user " + self.credentials().user() + " @ " + self.server() + ":")
client = self.set_client(False)
else:
logger.error(f"Error {e} during connection to {self.server()}")
client = None
return client

def get(self, file, path=None):
import getpass
localfile = None
client = self.set_client()
if not (path is None):
logger.debug(f"search {path}{file} @ {self.name()}")
localfile = self.get_file(client, path, file)
else:
for path in self.paths():
logger.debug(f"search {path}{file} @ {self.name()}")
if not(client is None):
if not (path is None):
logger.debug(f"search {file} in {path} @ {self.name()}")
localfile = self.get_file(client, path, file)
if not (localfile is None):
break
else:
logger.debug(f"file not found in {path}{file} @ {self.name()}")
if (localfile is None):
logger.debug(f"file {file} not found in {path} @ {self.name()}")
else:
for path in self.paths():
logger.debug(f"search {file} in {path}@ {self.name()}")
localfile = self.get_file(client, path, file)
if not (localfile is None):
break
else:
logger.debug(f"file {file} not found in {path} @ {self.name()}")
else:
logger.debug(f"Search in repository {self.name()} is skipped")


return localfile

## Search for files in remote location accessed through ssh.
@@ -423,9 +475,13 @@ def get(self, file, path=None):

def get_file(self, client, path, file):
localfile = None
stdin, stdout, stderr = client.exec_command('ls ' + path + file)
lines = list(map(lambda s: s.strip(), stdout.readlines()))
if len(lines) == 1:
#stdin, stdout, stderr = client.exec_command('ls ' + path + file)
#lines = list(map(lambda s: s.strip(), stdout.readlines()))

stdin, stdout, stderr = client.exec_command('find ' + path + " -type f -name " + file)
lines = sorted(list(map(lambda s: s.strip(), stdout.readlines())), key=len)
if len(lines) >= 1:
#if len(lines) == 1:
logger.debug(f"file found in repository {self.name()} @ " + lines[0].strip('\n'))
logger.debug(f"copy to {self.incoming()}{file}")
scpp = scp.SCPClient(client.get_transport())
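A note on the `find`-based lookup in `get_file` above: the command is built by plain string concatenation, so a path or file name containing spaces or shell metacharacters would be split or interpreted by the remote shell. The following is a minimal sketch of one way to quote the arguments and to pick the shortest match, as the hunk does with `key=len`. The helpers `build_find_command` and `pick_match` are hypothetical and not part of granddatalib. Quoting keeps the remote shell from expanding wildcards, while `find -name` still applies its own pattern matching.

```python
import shlex
from typing import List, Optional


def build_find_command(path: str, filename: str) -> str:
    """Build a remote `find` command with shell-quoted arguments (hypothetical helper)."""
    return f"find {shlex.quote(path)} -type f -name {shlex.quote(filename)}"


def pick_match(find_output: List[str]) -> Optional[str]:
    """Return the shortest non-empty path from the lines printed by `find`, or None."""
    matches = sorted((line.strip() for line in find_output if line.strip()), key=len)
    return matches[0] if matches else None
```

The resulting string could be passed to `client.exec_command(...)` in place of the concatenated command, and `pick_match(stdout.readlines())` would replace the `sorted(...)`/`lines[0]` pair.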
1 change: 1 addition & 0 deletions granddb/granddblib.py
@@ -72,6 +72,7 @@ def __init__(self, host, port, dbname, user, passwd, sshserv="", sshport=22, cre
self._cred = cred

if self._sshserv != "" and self._cred is not None:
#TODO: Check credentials for ssh tunnel and ask for passwds
self.server = SSHTunnelForwarder(
(self._sshserv, self.sshport()),
ssh_username=self._cred.user(),
47 changes: 35 additions & 12 deletions granddb/readme.md
@@ -22,10 +22,10 @@ Inifile is organized in sections. The 6 sections are [general][directories][repo
WEB = [ "https", "github.com" , 443, ["/grand-mother/data_challenge1/raw/main/coarse_subei_traces_root/"]]

[credentials]
; Name = [user, password, keyfile, keypasswd]
CC = ["login","","",""]
CCIN2P3 = ["login","","",""]
SSHTUNNEL = ["ssh_login","","",""]
; Name = [user, keyfile]
CC = ["login",""]
CCIN2P3 = ["login",""]
SSHTUNNEL = ["ssh_login",""]

[database]
; Name = [server, port, database, login, passwd, sshtunnel_server, sshtunnel_port, sshtunnel_credentials ]
@@ -35,29 +35,30 @@ Inifile is organized in sections. The 6 sections are [general][directories][repo
CCIN2P3 = "/sps/trend/fleg/INCOMING"


Directories are **local** directories where data should be. The first path in localdir will be used as an incoming folder (see below).
Directories are **local** directories where data should be. The first path in localdir will be used as an incoming folder (see also below). The incoming folder is the local folder into which files found remotely are copied. This directory must exist and be writable.
Repositories are **distant** places where data should be. Repositories are accessed using a protocol.

The following protocols are supported : ssh, http, https, local.

Sections [database] and [registerer] are optional (these sections can be commented or removed if you don't want to use the database).
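
The requirements on the incoming folder (it must exist, be an absolute path, and be writable) can be checked up front. Below is a minimal sketch of such a check. `check_incoming` is a hypothetical helper, not a function of the library, and it raises exceptions where the library itself logs an error and exits.

```python
import os
from pathlib import Path


def check_incoming(directory: str) -> Path:
    """Validate an incoming directory and return it as a Path (hypothetical helper)."""
    path = Path(directory)
    if not path.is_absolute():
        raise ValueError(f"incoming directory {directory} must be an absolute path")
    if not path.is_dir():
        raise FileNotFoundError(f"incoming directory {directory} does not exist")
    # Creating files in a directory needs both write and search (execute) permission.
    if not os.access(path, os.W_OK | os.X_OK):
        raise PermissionError(f"incoming directory {directory} is not writable")
    return path
```

Raising an exception rather than calling `exit(1)` is a design choice of this sketch: it lets the caller decide whether a bad incoming folder is fatal.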

For security reasons it is highly recommended NOT to provide any sensitive information as password keyfile or keypasswd
in this file. For example, if protocol is ssh, it's better to use an ssh-agent
The [credentials] section lets you specify your login and, optionally, a key file used to access repositories or to connect to the database through an ssh tunnel.
For security reasons, sensitive information such as passwords cannot be provided in this file. If a password is required (e.g. to decrypt the key file), it will be requested interactively.
For the ssh protocol, using an ssh-agent is strongly encouraged (it avoids typing the password interactively at each run).
To run an ssh-agent, just do: `eval $(ssh-agent)` and `ssh-add .ssh/id_rsa`

To export your ssh agent from host to docker simply add an environment variable SSH_AUTH_SOCK=/ssh-agent to your docker
and mount the volume with `-v ${SSH_AUTH_SOCK}:/ssh-agent`
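
To illustrate the two-field `[user, keyfile]` format, here is a minimal sketch that reads a `[credentials]` section with `configparser` and `json`, the same two modules the library uses for its ini file. `read_credentials`, the sample logins, and the key file path are assumptions made for the example. Note that `configparser` lowercases option names by default, so `CC` is returned as `cc`.

```python
import configparser
import json
from typing import Dict, Tuple

# Sample content; the logins and the key file path are placeholders.
SAMPLE = """
[credentials]
; Name = [user, keyfile]
CC = ["login", ""]
SSHTUNNEL = ["ssh_login", "/home/user/.ssh/id_rsa"]
"""


def read_credentials(text: str) -> Dict[str, Tuple[str, str]]:
    """Return {name: (user, keyfile)} from ini text (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    credentials: Dict[str, Tuple[str, str]] = {}
    if parser.has_section("credentials"):
        for name in parser["credentials"]:
            values = json.loads(parser.get("credentials", name))
            if len(values) != 2:
                # The old four-field form (with passwords) is rejected in this sketch.
                raise ValueError(f"credentials '{name}' must be [user, keyfile]")
            user, keyfile = values
            credentials[name] = (user, keyfile)
    return credentials
```

Rejecting entries that still use the old four-field form makes a stale config file fail loudly instead of silently dropping the extra fields.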

## Datamanager
When instantiate, a datamanager object will read it's configuration from the ini file. If a database is declared, it will connect to the DB to get a list of eventual other repositories.
When instantiated, a datamanager object will read its configuration from the ini file. If a database is declared, it will connect to the DB to get a list of any additional repositories.

### The get function
The get(filename) function will perform the following actions :
- Search if a file called < filename > exists in localdirs.
- If yes, returns the path to the file.
- If no, search for the file in the various repositories.
- If found in a repository, then get the file (using protocol for the repository) and copy it in the incoming local directory and return the path to the newly copied file.
- Search if a file called < filename > exists in localdirs (and subdirs).
- If yes, returns the path to the first file found.
- If no, recursively search for the file in the various repositories.
- If found in a repository, then get the first file found (using protocol for the repository) and copy it in the incoming local directory and return the path to the newly copied file.
- If not, return None.
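
The local part of this lookup (search the localdirs and their subdirectories, return the first match or None) can be sketched as follows. This is a minimal illustration based on `Path.rglob`, which the commit also uses. `find_local` is a hypothetical name, and sorting the candidates is an addition of this sketch to make "first file found" deterministic.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_local(filename: str, localdirs: Iterable[str]) -> Optional[Path]:
    """Return the first regular file named `filename` under any of `localdirs`, else None."""
    for directory in localdirs:
        root = Path(directory)
        if not root.is_dir():
            # A missing directory is simply skipped in this sketch.
            continue
        # rglob walks the directory and all its subdirectories;
        # `filename` is interpreted as a glob pattern.
        for candidate in sorted(root.rglob(filename)):
            if candidate.is_file():
                return candidate
    return None
```

For example, `find_local("Coarse3.root", ["/home/examples/datalib/incoming"])` would return the path of the file if it is present anywhere below that folder.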

Usage example:
@@ -90,6 +91,7 @@ The search function (not yet properly implemented) will return the list of repos
It will perform a search in the database.

### Test example
#### For linux users

To test, you can do the following :

@@ -109,3 +111,24 @@ To test, you can do the following :
* Check that Coarse3.root has been retrieved in /home/examples/datalib/incoming

ls /home/examples/datalib/incoming/Coarse3.root

#### For mac users
macOS does not allow forwarding the ssh agent into docker, so you will have to start an agent directly inside your docker:
* Edit and configure the examples/datalib/config.ini
* Run the docker

docker run -it -v /path/to/grand/lib:/home -v /path/to/.ssh:/home/.ssh --rm grandlib/dev:1.2


* Inside the docker do :


eval $(ssh-agent)
ssh-add .ssh/id_rsa
source env/setup.sh
cd examples/datalib/
python datamanager_example.py

* Check that Coarse3.root has been retrieved in /home/examples/datalib/incoming

ls /home/examples/datalib/incoming/Coarse3.root