Skip to content
This repository has been archived by the owner on Apr 24, 2022. It is now read-only.

Add --tstop and --tstart option preventing GPU overheating. #1146

Conversation

StefanOberhumer
Copy link
Collaborator

As we're getting summer here:

To be sure not to burn the GPUs the miner stops mining on a GPU
if a specific temperature (--tstop) is reached.
After cooling down and reaching --tstart temperature (default 40)
mining will be (re)started on the GPU.

@@ -690,6 +729,9 @@ class MinerCLI
<< " Use at your own risk! If GPU generates errored results they WILL be forwarded to the pool" << endl
<< " Not recommended at high overclock." << endl
#endif
<< " Temperature management: (implies -HWMON=0|1)" << endl
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description should be set to

"Temperature management: (implies -HWMON=1)"

Would not work on without hwmon

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-HWMON=0 will be automatically enabled (m_show_hwmonitors=true which is what -HWMON=0 does).

{
try
{
m_tstop = stol(argv[++i]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use stoul instead of stol

{
try
{
m_tstart = stol(argv[++i]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stoul instead of stol



// Sanity check --tstart/--tstop
if (m_tstop && m_tstop <= m_tstart)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implement a check for tstop to have a range from tstart of, at least, 10 degrees
Otherwise a tstop of 41 and a tstart of 40 would be legit but will never start mining I assume

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also do not let tstart be lower than 40.
A tstart of 10 would make no sense.

Copy link
Collaborator Author

@StefanOberhumer StefanOberhumer May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • In your example - it will mine if the temperature equals 40 degrees.
  • Forcing (tstart >= 40) and (tstart + 10 <= tstop):
    Hmmm: I like it when software does not too much automatically - the user should have the control !
    I force the two values to be in a range of 30 to 100 and 0 for tstop to disable feature again (eg --tstop 60 --tstop 0) ?

@StefanOberhumer StefanOberhumer deleted the Add-tstop-tstart-Option--PreventingOverheating branch May 24, 2018 11:16
@StefanOberhumer StefanOberhumer restored the Add-tstop-tstart-Option--PreventingOverheating branch May 24, 2018 11:43
@StefanOberhumer StefanOberhumer force-pushed the Add-tstop-tstart-Option--PreventingOverheating branch 2 times, most recently from 0055d4e to bcd9099 Compare May 24, 2018 12:33
@@ -241,6 +282,7 @@ class Miner: public Worker
std::chrono::high_resolution_clock::time_point workSwitchStart;
HwMonitorInfo m_hwmoninfo;
private:
bool m_wait_for_tstart_temp = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would make m_wait_for_tstart_temp a std::atomic as we're querying from different threads.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it must be accessed from different threads? If that's required make it an atomic_flag.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cause each miner lives in it's own thread and the value is getting accessed by Farm which lives on a separate thread.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chfast , @AndreaLanfranchi
As I'm new in this topic (C++, C11) - do we really need this?
As m_wait_for_tstart_temp is just updated (modified) in update_temperature() in Miner.h which is called from the farm and I don't think its a critical section - do I have to use atomic_flag ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you have to. Otherwise this will be a race condition and Thread Sanitizer will report that as an issue.

@AndreaLanfranchi
Copy link
Collaborator

Bottomline ... a good catch.
Highly requested.
Almost ready to merge.

@@ -241,6 +282,7 @@ class Miner: public Worker
std::chrono::high_resolution_clock::time_point workSwitchStart;
HwMonitorInfo m_hwmoninfo;
private:
bool m_wait_for_tstart_temp = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it must be accessed from different threads? If that's required make it an atomic_flag.

To be sure not to burn the GPUs the miner stops mining on a GPU
if a specific temperature (--tstop) is reached.
After cooling down and reaching --tstart temperature (default 40)
mining will be (re)started on the GPU.
@StefanOberhumer StefanOberhumer force-pushed the Add-tstop-tstart-Option--PreventingOverheating branch from bcd9099 to b0eb122 Compare May 25, 2018 08:06
@StefanOberhumer
Copy link
Collaborator Author

As suggested to use std::atomic_flag for m_wait_for_tstart_temp:

  • Unlike std::atomic<bool>, std::atomic_flag does not provide load or store operations.
  • So I decided to use std::atomic<bool> for m_wait_for_tstart_temp flag.

Copy link
Contributor

@chfast chfast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!
Sorry about confusion about atomic_flag vs atomic<bool>. I though they are almost the same.

@chfast chfast merged commit 8cc88d9 into ethereum-mining:master May 25, 2018
@StefanOberhumer StefanOberhumer deleted the Add-tstop-tstart-Option--PreventingOverheating branch May 25, 2018 10:35
@ali8889
Copy link

ali8889 commented Sep 20, 2018

Goood

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants