Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-1188] Maven Publishing Test Environments #13707

Merged
merged 1 commit into from
Jan 10, 2019

Conversation

zachgk
Copy link
Contributor

@zachgk zachgk commented Dec 21, 2018

Description

This PR is the second part of the maven Jenkins publishing that adds testing of the produced packages on 16.04, 18.04, and centOS. It also creates dedicated publishing dockerfiles (on 14.04) and testing dockerfiles that are built without any dependencies to ensure that the static linking worked properly.

Because the packages built on this do not yet use static linking, it instead tests against the snapshot repository.

Note that this PR depends on (and is rebased on top of) #13450. When reviewing the files, please look at only the changes in d4ac689.

Please review so the pipeline can be put into production:
@nswamy @lanking520 @andrewfayres @piyushghai @marcoabreu @yzhliu

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@zachgk
Copy link
Contributor Author

zachgk commented Dec 21, 2018

@mxnet-label-bot add [CI, Java, Maven, Scala, pr-awaiting-review]

@marcoabreu marcoabreu added CI Java Label to identify Java API component Maven pr-awaiting-review PR is waiting for code review Scala labels Dec 21, 2018
Copy link
Contributor

@marcoabreu marcoabreu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved d4ac689 only (as requested in the description)

@sandeep-krishnamurthy
Copy link
Contributor

@frankfliu - You may be interested in this PR

Copy link
Member

@lanking520 lanking520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, please address the comments

ci/docker/install/centos7_scala.sh Show resolved Hide resolved
Copy link
Member

@lanking520 lanking520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Please also apply changes to the scripts in dev folder which reflect Frank's build changes

ci/docker/Dockerfile.publish.test.ubuntu1404_gpu Outdated Show resolved Hide resolved
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

14.04, same as the one below

@lanking520
Copy link
Member

@aaronmarkham I tried a couple of time to restart the tests however it seemed not to be flaky. Could you please take a look in here?

@@ -92,22 +92,24 @@ def get_dockerfiles_path():

def get_platforms(path: str = get_dockerfiles_path()) -> List[str]:
"""Get a list of architectures given our dockerfiles"""
dockerfiles = glob.glob(os.path.join(path, "Dockerfile.build.*"))
dockerfiles = glob.glob(os.path.join(path, "Dockerfile.*"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the files added by you have Dockerfile.publish as the naming convention.
Should we add ".publish" as well to the wildcard file search ? Since for the purpose of this method is to get platforms for publishing artifacts.
WDYT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the main CI code and tries to find all Dockerfiles (for caching purposes). So, it should cover both the .build and .publish dockerfiles.

@aaronmarkham
Copy link
Contributor

@aaronmarkham I tried a couple of time to restart the tests however it seemed not to be flaky. Could you please take a look in here?

I've seen this "bad address" error come up a few times recently. Restarting the CI pipeline for the failed test seems to work. I restarted it.

set -ex
apt-get update || true
apt-get install -y \
build-essential \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why all of this when it happens with ubuntu_core.sh? What's the difference supposed to be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is the mxnet dependencies (opencv, cuda, blas, etc.). Core should include them because it uses them to build and test mxnet throughout the CI. base should not include them because it is only for testing whether the static linking works in the publishing pipeline and the tests might erroneously pass if the dependencies are installed on the system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base should not include them because it is only for testing whether the static linking works in the publishing pipeline and the tests might erroneously pass if the dependencies are installed on the system.

@zachgk In that case we better test on a vanilla system? Why install all the development dependencies listed in ubuntu_base.sh if only testing the binaries? Could you help to clarify?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is less to test binaries than to test Scala. So, we still need a few dependencies (jdk, maven, git, wget, etc). It's mostly the dependencies that are statically linked that need to be excluded.

Copy link
Member

@lanking520 lanking520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@andrewfayres andrewfayres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lanking520 lanking520 merged commit 30b552d into apache:master Jan 10, 2019
zhaoyao73 added a commit to zhaoyao73/incubator-mxnet that referenced this pull request Jan 11, 2019
* upstream/master: (109 commits)
  Code modification for  testcases of various network models in directory example (apache#12498)
  [CI] Prevent timeouts when rebuilding containers with docker. (apache#13818)
  fix Makefile for rpkg (apache#13590)
  change to compile time (apache#13835)
  Disabled flaky test (apache#13758)
  Improve license_header tool by only traversing files under revision c… (apache#13803)
  Removes unneeded nvidia driver ppa installation (apache#13814)
  Add Local test stage and option to jump directly to menu item from commandline (apache#13809)
  Remove MXNET_STORAGE_FALLBACK_LOG_VERBOSE from test_autograd.py (apache#13830)
  Fix scala doc build break for v1.3.1 (apache#13820)
  [MXNET-1263] Unit Tests for Java Predictor and Object Detector APIs (apache#13794)
  [MXNET-1260] Float64 DType computation support in Scala/Java (apache#13678)
  onnx export ops (apache#13821)
  [MXNET-880] ONNX export: Random uniform, Random normal, MaxRoiPool (apache#13676)
  fix minor indentation (apache#13827)
  Fixing a symlink issue with R install (apache#13708)
  remove useless code (apache#13777)
  ONNX ops: norm exported and lpnormalization imported (apache#13806)
  Add new Maven build for Scala package (apache#13819)
  Dockerfiles for Publish Testing (apache#13707)
  ...
@zachgk zachgk deleted the testMaven branch April 12, 2019 22:25
lanking520 pushed a commit to lanking520/incubator-mxnet that referenced this pull request Apr 26, 2019
lanking520 pushed a commit to lanking520/incubator-mxnet that referenced this pull request Apr 30, 2019
zachgk added a commit to zachgk/incubator-mxnet that referenced this pull request May 16, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CI Java Label to identify Java API component Maven pr-awaiting-review PR is waiting for code review Scala
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants