Modifying ReutersBridge #5

Merged
merged 52 commits into from
Aug 9, 2020
Changes from 40 commits
97c0f0f
- Add all article from 'Editor\'s Highlight' to the feed
csisoap Jul 20, 2020
11e49ff
- Add name.
csisoap Jul 20, 2020
0ffec67
Merge branch 'reuters' into reuters-1
csisoap Jul 20, 2020
c4897ee
Update ReutersBridge.php
csisoap Jul 20, 2020
4413129
Update ReutersBridge.php
csisoap Jul 20, 2020
8285c86
Update ReutersBridge.php
csisoap Jul 21, 2020
9d57a25
Update ReutersBridge.php
csisoap Jul 24, 2020
1d6ee98
Refactor some code/Add image to the description.
csisoap Jul 27, 2020
842f5b6
Testing Reuters plugin for RSS-Bridge
csisoap Jul 27, 2020
ef27c98
Testing Reuters plugin for RSS-Bridge
csisoap Jul 27, 2020
d6c6ccf
Adjust some code
csisoap Jul 27, 2020
2f4bda8
Forgot to save before commit
csisoap Jul 27, 2020
d0e980e
Update ReutersBridge.php
csisoap Aug 2, 2020
ed411a0
Update ReutersBridge.php
csisoap Aug 3, 2020
e281d85
Add full article, author.
csisoap Aug 5, 2020
f1cf84f
Update ReutersBridge.php
csisoap Aug 5, 2020
be1796d
Update ReutersBridge.php
csisoap Aug 5, 2020
96b9c10
Update ReutersBridge.php
csisoap Aug 5, 2020
e206226
Update ReutersBridge.php
csisoap Aug 5, 2020
b73e3bf
Merge branch 'reuters-1' into master
csisoap Aug 5, 2020
f3f8eec
Delete TestReutersBridge.php
csisoap Aug 5, 2020
9deff71
Merge pull request #1 from csisoap/master
csisoap Aug 5, 2020
abd4f33
Update ReutersBridge.php
csisoap Aug 5, 2020
aa1a439
Update ReutersBridge.php
csisoap Aug 5, 2020
cde8888
Update ReutersBridge.php
csisoap Aug 5, 2020
ad3b00f
Update ReutersBridge.php
csisoap Aug 5, 2020
a021cd2
Update ReutersBridge.php
csisoap Aug 5, 2020
0f2f01b
Update ReutersBridge.php
csisoap Aug 5, 2020
7d4b4cb
Merge branch 'reuters' into reuters-1
csisoap Aug 5, 2020
07c0a4e
Update ReutersBridge.php
csisoap Aug 5, 2020
44388bd
Update ReutersBridge.php
csisoap Aug 5, 2020
c7f18a5
Update ReutersBridge.php
csisoap Aug 5, 2020
aed9298
Update ReutersBridge.php
csisoap Aug 5, 2020
d8b06cc
Update ReutersBridge.php
csisoap Aug 6, 2020
8af6551
Fix syntax for ReutersBridge
csisoap Aug 7, 2020
9b0f7aa
Fix syntax for ReutersBridge
csisoap Aug 7, 2020
1890111
Fixing indentation/syntax for ReutersBridge
csisoap Aug 7, 2020
e8d2cff
Fixing indentation/syntax for ReutersBridge
csisoap Aug 7, 2020
92dc791
Fixing indentation/syntax for ReutersBridge
csisoap Aug 7, 2020
b060cb9
Fix issue with displaying author's name
csisoap Aug 7, 2020
638843e
Fixing thing
csisoap Aug 8, 2020
339ff59
Corrected the feed's name
csisoap Aug 9, 2020
a985b14
Add type of images to support.
csisoap Aug 9, 2020
984dad6
Fix the code indenation using PHPCBF
csisoap Aug 9, 2020
06ead1b
Modify some code to to meet coding standard for RSS-Bridge
csisoap Aug 9, 2020
b4122a0
Modify some code to to meet coding standard for RSS-Bridge
csisoap Aug 9, 2020
98b563e
Fix some code
csisoap Aug 9, 2020
38c5ba1
Fix indentation issue
csisoap Aug 9, 2020
c656759
Modify some code again using PHPCBF
csisoap Aug 9, 2020
b685e4a
Fix indentation issue
csisoap Aug 9, 2020
1f85147
Fix indentation issue
csisoap Aug 9, 2020
c5f38f5
Fix indentation issue
csisoap Aug 9, 2020
295 changes: 196 additions & 99 deletions bridges/ReutersBridge.php
@@ -1,101 +1,198 @@
<?php
class ReutersBridge extends BridgeAbstract {

const MAINTAINER = 'hollowleviathan, spraynard, csisoap';
const NAME = 'Reuters Bridge';
const URI = 'https://reuters.com/';
const CACHE_TIMEOUT = 1800; // 30min
const DESCRIPTION = 'Returns news from Reuters';
private $feedName = self::NAME;

const ALLOWED_WIREITEM_TYPES = array(
'story'
);

const ALLOWED_TEMPLATE_TYPES = array(
'story'
);

const PARAMETERS = array(array(
'feed' => array(
'name' => 'News Feed',
'type' => 'list',
'exampleValue' => 'World',
'title' => 'Reuters feed. World, US, Tech...',
'values' => array(
'Tech' => 'tech',
'Wire' => 'wire',
'Health' => 'health',
'Business' => 'business',
'World' => 'world',
'Politics' => 'politics',
'Science' => 'science',
'Energy' => 'energy',
'Aerospace and Defence' => 'aerospace',
'China' => 'china',
'Top News' => 'home/topnews',
'Lifestyle' => 'lifestyle',
'Markets' => 'markets',
'Sports' => 'sports',
'Pic of the Day' => 'pictures', // This has a different configuration than the others.
'USA News' => 'us'
)
),
));

private function getJson($feedname) {
$uri = "https://wireapi.reuters.com/v8/feed/rapp/us/tabbar/feeds/$feedname";
$returned_data = getContents($uri);
return json_decode($returned_data, true);
}

public function getName() {
return $this->feedName;
}

public function collectData() {
$feed = $this->getInput('feed');
$data = $this->getJson($feed);
$reuters_wireitems = $data['wireitems'];
$this->feedName = $data['wire_name'] . ' | Reuters';
/**
* Gets a list of wire items which are groups of templates
*/
$reuters_allowed_wireitems = array_filter(
$reuters_wireitems, function ($wireitem) {
return in_array($wireitem['wireitem_type'], self::ALLOWED_WIREITEM_TYPES);
}
);

/**
* Gets a list of "Templates", which is data containing a story
*/
$reuters_wireitem_templates = array_reduce(
$reuters_allowed_wireitems, function (array $carry, array $wireitem) {
$wireitem_templates = $wireitem['templates'];
return array_merge(
$carry, array_filter(
$wireitem_templates, function (array $template_data) {
return in_array($template_data['type'], self::ALLOWED_TEMPLATE_TYPES);
}
)
);
}, array()
);

// Check to see if there have Editor's Highlight sections in the first index.
if($reuters_wireitems[0]['wireitem_type'] == 'headlines') {
$top_highlight = $reuters_wireitems[0]['templates'][1]['headlines'];
$reuters_wireitem_templates = array_merge($top_highlight, $reuters_wireitem_templates);
}

foreach ($reuters_wireitem_templates as $story) {
$item['content'] = $story['story']['lede'];
$item['title'] = $story['story']['hed'];
$item['timestamp'] = $story['story']['updated_at'];
$item['uri'] = $story['template_action']['url'];

$this->items[] = $item;
}
}
class ReutersBridge extends BridgeAbstract
{
const MAINTAINER = 'hollowleviathan, spraynard, csisoap';
const NAME = 'Reuters Bridge';
const URI = 'https://reuters.com/';
const CACHE_TIMEOUT = 1800; // 30min
const DESCRIPTION = 'Returns news from Reuters';
private $feedName = self::NAME;

const ALLOWED_WIREITEM_TYPES = ['story', 'headlines'];

const ALLOWED_TEMPLATE_TYPES = ['story'];

const PARAMETERS = [
[
'feed' => [
'name' => 'News Feed',
'type' => 'list',
'exampleValue' => 'World',
'title' => 'Reuters feed. World, US, Tech...',
'values' => [
'Tech' => 'tech',
'Wire' => 'wire',
'Health' => 'health',
'Business' => 'business',
'World' => 'world',
'Politics' => 'politics',
'Science' => 'science',
'Energy' => 'energy',
'Aerospace and Defence' => 'aerospace',
'China' => 'china',
'Top News' => 'home/topnews',
'Lifestyle' => 'lifestyle',
'Markets' => 'markets',
'Sports' => 'sports',
'Pic of the Day' => 'pictures', // This has a different configuration than the others.
'USA News' => 'us',
],
],
],
];

private function getJson($feedname)
{
$uri = "https://wireapi.reuters.com/v8/feed/rapp/us/tabbar/feeds/$feedname";
$returned_data = getContents($uri);
return json_decode($returned_data, true);
}

public function getName()
{
return $this->feedName;
}

private function processData($data)
{
/**
* Gets a list of wire items which are groups of templates
*/
$reuters_allowed_wireitems = array_filter($data, function ($wireitem) {
return in_array(
$wireitem['wireitem_type'],
self::ALLOWED_WIREITEM_TYPES
);
});

/*
* Gets a list of "Templates", which is data containing a story
*/
$reuters_wireitem_templates = array_reduce(
$reuters_allowed_wireitems,
function (array $carry, array $wireitem) {
$wireitem_templates = $wireitem['templates'];
return array_merge(
$carry,
array_filter($wireitem_templates, function (
array $template_data
) {
return in_array(
$template_data['type'],
self::ALLOWED_TEMPLATE_TYPES
);
})
);
},
[]
);

return $reuters_wireitem_templates;
}

private function getArticle($feed_uri)
{
// Make a second API request to fetch the full article body and the authors' names.
$uri = "https://wireapi.reuters.com/v8$feed_uri";
$data = getContents($uri);
$process_data = json_decode($data, true);
$reuters_wireitems = $process_data['wireitems'];
$processedData = $this->processData($reuters_wireitems);

$first = reset($processedData);
$article_content = $first['story']['body_items'];
$authorlist = $first['story']['authors'];

// Join the authors' names into a single comma-separated string.
$author = implode(', ', array_column($authorlist, 'name'));

$description = '';
foreach ($article_content as $content) {
$data = $content['content'];
// Check whether the content is an image URL.
if (
strpos($data, '.png') !== false ||
strpos($data, '.jpg') !== false
) {
$description = $description . "<img src=\"$data\">";
} else {
if ($content['type'] == 'inline_items') {
// Inline items split a paragraph into fragments (e.g. brand or company names); join them back into one paragraph.
$item_list = $content['items'];
$description = $description . '<p>';
foreach ($item_list as $item) {
$description = $description . $item['content'];
}
$description = $description . '</p>';
} else {
if (
strtoupper($data) == $data ||
$content['type'] == 'heading'
) {
// Render all-caps text and 'heading' items as headings.
$description = $description . "<h3>$data</h3>";
} else {
$description = $description . "<p>$data</p>";
}
}
}
}

$content_detail = [
'content' => $description,
'author' => $author,
];
return $content_detail;
}

public function collectData()
{
$feed = $this->getInput('feed');
$data = $this->getJson($feed);
$reuters_wireitems = $data['wireitems'];
$this->feedName = $data['wire_name'] . ' | Reuters';
$processedData = $this->processData($reuters_wireitems);

// Merge all articles from Editor's Highlight section into existing array of templates.
$top_section = reset($reuters_wireitems);
if ($top_section['wireitem_type'] == 'headlines') {
$top_articles = $top_section['templates'][1]['headlines'];
$processedData = array_merge($top_articles, $processedData);
}

foreach ($processedData as $story) {
$item = []; // Initialise the item explicitly for each story.
$item['uid'] = $story['story']['usn'];
$article_uri = $story['template_action']['api_path'];
$content_detail = $this->getArticle($article_uri);
$description = $content_detail['content'];
$author = $content_detail['author'];
$item['author'] = $author;
if (empty($description)) {
$description = $story['story']['lede']; // Fall back to the lede when the article body is empty.
}
$image_url = $story['image']['url'];
if (!(bool) $image_url) {
$image_url =
'https://s4.reutersmedia.net/resources_v2/images/rcom-default.png'; // Fall back to the default Reuters image when the story has none.
}
$item['content'] = "<img src=\"$image_url\"> \n
$description";
$item['title'] = $story['story']['hed'];
$item['timestamp'] = $story['story']['updated_at'];
$item['uri'] = $story['template_action']['url'];
$this->items[] = $item;
}
}
}
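
The author formatting and body rendering in getArticle() can be tried outside RSS-Bridge. Below is a minimal standalone sketch, assuming PHP 7 or later and a body_items shape like the one the diff reads ('type', 'content', 'items'). The function names formatAuthors and renderBodyItems and the sample data are hypothetical and are not part of the bridge or of the Reuters API.

```php
<?php
// Hypothetical helpers mirroring the branching in getArticle(); not RSS-Bridge API.

// Join author names with ", ", matching the result of the counter-based loop.
function formatAuthors(array $authorlist): string
{
    return implode(', ', array_column($authorlist, 'name'));
}

// Turn a list of body items into HTML using the same checks, in the same order,
// as the diff: image URL first, then inline items, then headings, then paragraphs.
function renderBodyItems(array $bodyItems): string
{
    $html = '';
    foreach ($bodyItems as $content) {
        $type = isset($content['type']) ? $content['type'] : '';
        $data = isset($content['content']) ? $content['content'] : '';

        if (strpos($data, '.png') !== false || strpos($data, '.jpg') !== false) {
            // Content that looks like an image URL becomes an <img> tag.
            $html .= "<img src=\"$data\">";
        } elseif ($type === 'inline_items') {
            // Inline fragments are concatenated into a single paragraph.
            $html .= '<p>';
            $items = isset($content['items']) ? $content['items'] : [];
            foreach ($items as $inline) {
                $html .= $inline['content'];
            }
            $html .= '</p>';
        } elseif (strtoupper($data) == $data || $type === 'heading') {
            // All-caps text and explicit headings become <h3>.
            $html .= "<h3>$data</h3>";
        } else {
            $html .= "<p>$data</p>";
        }
    }
    return $html;
}

// Made-up sample data for illustration only.
$sampleAuthors = [['name' => 'Jane Doe'], ['name' => 'John Roe']];
$sampleBody = [
    ['type' => 'paragraph', 'content' => 'Hello world.'],
    ['type' => 'heading', 'content' => 'Markets'],
    ['type' => 'image', 'content' => 'https://example.com/a.jpg'],
    ['type' => 'inline_items', 'items' => [
        ['content' => 'Shares of '],
        ['content' => 'ACME'],
    ]],
];

echo formatAuthors($sampleAuthors), "\n";
echo renderBodyItems($sampleBody), "\n";
```

Note that, as in the diff, the substring check treats any content containing '.png' or '.jpg' as an image, and none of the values are HTML-escaped; whether that is acceptable depends on what the API returns.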